PREFACE


This volume is part of a 16-volume set that summarizes the research accomplishments of
faculty, graduate student, and high school participants in the 1992 Air Force Office of Scientific
Research (AFOSR) Summer Research Program. The current volume, Volume 9 of 16, presents
the final research reports of graduate student (GSRP) participants at Rome Laboratory.

Reports  presented  herein  are  arranged  alphabetically  by  author  and  are  numbered 
consecutively  --  e.g.,  1-1,  1-2,  1-3;  2-1,  2-2,  2-3. 

Research reports in the 16-volume set are organized as follows:

5B Summer Faculty Research Program Reports: Wright Laboratory (part two)

6  Summer  Faculty  Research  Program  Reports:  Arnold  Engineering  Development  Center;  Civil 
Engineering  Laboratory;  Frank  J.  Seiler  Research  Laboratory;  Wilford  Hall  Medical  Center 

7  Graduate  Student  Research  Program  Reports:  Armstrong  Laboratory 

8 Graduate Student Research Program Reports: Phillips Laboratory

Photonic Transversal Filtering for Microwave Systems


Charity  A.  Carter 
Graduate  Student 

Department  of  Electrical  Engineering 
Stevens  Institute  of  Technology 


Abstract 

Previous  papers  have  shown  that  it  is  possible  to  realize  optical  processors  that  can  be 
used  in  wideband  transmit  and  receive  systems  which  would  otherwise  be  restricted  to 
narrowband use [1,2,3]. The continuously variable time delay supplied by these processors makes
them  ideal  for  use  in  systems,  such  as  phased  arrays,  which  employ  transversal  filtering.  This 
report  discusses  the  implementation  of  acousto-optic  based  and  fiber  optic  based  optical 
processors  in  these  systems.  In  addition,  non-uniform  sampling,  arising  from  the  availability  of 
a  continuously  variable  delay,  is  presented.  Finally,  it  is  shown  that  a  wavelet  transform  can  be 
used  to  perform  the  correlation  required  for  a  matched  filter  used  to  extract  information  from  a 
radar signal.

Introduction 

Most beamforming methods used for phased array antenna systems are limited to
narrowband operation because a continuously variable time delay has not been available. If
wideband operation is desired, variable delay is necessary since, without it, the beamforming error
known as squint will result. However, by making use of photonic methods, continuously variable
time delays that can be utilized in both wideband phased array transmit and receive systems have
been constructed [1,3]. In the transmit mode, an acousto-optic cell addressed by a deformable
mirror device is used in the construction of the delay line system. A segmented mirror device and
a fiber optic delay line array are utilized in the receive system.

Transversal  filters,  which  are  used  in  many  signal  processing  filter  applications,  employ 
amplitude  weighting  and  time  delay  in  their  architecture.  Given  the  existence  of  continuously 
variable time delay, it would be possible to incorporate this into a system to yield a filter capable
of continuous reconfigurability [2].

Matched  filters  are  used  in  radar  systems  to  minimize  the  effects  of  noise  in  the  process 
of  signal  detection  [5].  These  filters  make  use  of  a  correlation  integral  in  order  to  distinguish 
signal  information  from  noise.  Due  to  the  availability  of  a  continuously  variable  time  delay,  a 
phased  array  radar  system  can  be  used  for  wideband  operation.  It  has  been  shown  that,  for  a 
wideband  signal,  the  continuous  time  wavelet  transform  becomes  a  "wideband  cross-ambiguity 
function".  Therefore  a  wavelet  transform  can  be  implemented  in  wideband  radar  systems  to  carry 
out  the  matched  filter  operation  [4]. 

This report focuses on the implementation of continuously variable time delay for
wideband use of microwave systems. This investigation formed the basis of the summer research
project. Further research will be carried out in these areas, especially that of the implementation
and optimization of the variable time delay in systems utilizing transversal filter architectures.

Acousto-Optic Based Optical Processor

In  order  to  realize  a  continuously  variable  time  delay,  an  acousto-optic  cell  operating  in 
the  Bragg  regime  has  been  utilized  in  an  optical  heterodyne  configuration  [1].  The  heterodyne 
system  forms  the  basis  for  obtaining  phase  information  while  the  AO  cell  acts  as  a  frequency 
shifter  of  the  laser  light  source. 

The process of optical heterodyning involves splitting the output of a laser into two paths.
In one of the paths, the beam is frequency shifted and then optically phase shifted. The other
beam provides a phase reference and is referred to as the local oscillator beam. The two beams
are then summed and input to a photodetector that responds to the time-average intensity of the
light. The output produced contains an RF frequency component equal to the optical beat
frequency and an RF phase shift equal to the optical phase shift. This optical phase shift
varies linearly with frequency as required for a true time delay.

Of fundamental importance in achieving continuously variable time delay is the use of
an  acousto-optic  cell.  The  AO  cell  provides  frequency  translation  and  spatial  dispersion  of  the 
optical  frequencies.  In  addition,  the  AO  cell  acts  as  the  energy  storage  device  necessary  for  any 
physical  delay.  The  acoustic  wave,  which  propagates  along  the  cell,  is  produced  by  a 
piezoelectric  transducer  whose  input  is  an  RF  signal.  The  RF  time  delay  introduced  is  determined 
by  the  time  required  for  the  acoustic  wave  to  propagate  a  portion  of  the  laser  beam  width  through 
the  AO  cell. 

Fiber  Based  Optical  Processor 

The above system, using a single optically tapped AO cell, functions efficiently in the
transmit mode of operation when used in the formation of an electromagnetic radiation pattern
for phased array antennas. However, in the receive mode, the non-reciprocal character of
acousto-optic modulators would require the use of one AO cell per antenna element. If weight and power
constraints exist, this approach would be impractical. It has been shown that an optical processor
for the control of broadband phased array receive systems can be constructed based on the use
of a spatial light modulator known as a segmented mirror device (SMD) [3]. This device consists
of an NxN array of controllable mirror elements which serve to steer light into an array of fiber
delay lines. Each fiber in the array is of a different length; therefore, the light traveling through
one fiber experiences a delay different from light passing through any other fiber. The output of
these fibers can then be summed and fed into a detector to yield a system with variable delay.
This receive configuration, making use of an SMD, is well suited for applications involving large
arrays. Due to its wide electrical bandwidth characteristic, it overcomes the dispersive
beampointing error known as squint, which is associated with narrowband systems.

Transversal  Filtering 

Transversal filtering, which makes use of tapped-delay lines, is a powerful signal
processing technique. In general, the output of an adaptive (reconfigurable) transversal filter
consists of the sum of weighted and delayed versions of an input signal [6]. Therefore, for an N
element transversal filter, the equation for the output, y(t), is given by:

$$y(t) = \sum_{i=1}^{N} a_i \, x(t - T_i) \tag{1}$$

where x(t) represents the input signal, $T_i$ represents the time delays and $a_i$ represents the
amplitude weight values.
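
As a minimal sketch of equation (1), the following Python fragment evaluates the weighted sum of delayed copies of an input signal. The function name `transversal_output`, the three tap weights, the delay values, and the 1 GHz test tone are all illustrative assumptions and do not come from the report.

```python
import math

def transversal_output(x, weights, delays, t):
    """Evaluate y(t) = sum_i a_i * x(t - T_i) for a continuous-time input x."""
    if len(weights) != len(delays):
        raise ValueError("need exactly one delay per weight")
    return sum(a * x(t - T) for a, T in zip(weights, delays))

# Hypothetical 3-tap filter; the delays (in seconds) need not be uniformly spaced.
weights = [0.5, 0.3, 0.2]
delays = [0.0, 1.0e-9, 2.5e-9]

def tone(t, f=1.0e9):
    """Assumed test input: a 1 GHz cosine."""
    return math.cos(2.0 * math.pi * f * t)

# At t = 0 the three taps see cos(0), cos(-2*pi) and cos(-5*pi).
y0 = transversal_output(tone, weights, delays, 0.0)
```

Because the delays are passed as an arbitrary list, the same function covers both the uniform and the non-uniform tap spacing discussed in this report.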

In a typical adaptive filter, the output is compared to the desired response signal. An error
exists if there is any difference between the two signals. However, since the filter is adaptable,
the weights $a_i$ can be adjusted to reduce the error present in y(t). In most cases, the optimization
of the weights is carried out in a manner resulting in the minimization of the mean square value
or average power of the error signal. The mean square error is a quadratic function of the
amplitude values $a_i$, which are often collectively referred to as the weight vector. A plot of the
mean square error versus the amplitude weight values gives a "bowl-shaped" surface commonly
termed the performance surface. By finding the minimum of the performance surface, which is
typically done by gradient methods, the optimal weight values can be determined. In traditional
adaptive filters, the values of $T_i$ are uniform and do not aid in the adaptation process [6].

The previous adaptive transversal filter configuration contains only one degree of freedom,
namely the $a_i$ values. However, when a continuously variable delay line is employed to construct
the delay elements, it becomes possible to design a system in which the individual delays may
be set for different values. Therefore, a second degree of freedom exists. The adaptation process
can then be carried out using the $a_i$ and $T_i$ values, which may be adjusted to aid in the
optimization of the output and reduce the error present. In order to efficiently make use of this
capability, it would be necessary to develop an algorithm that will carry out the optimization of
the adjustable parameters. This topic will be the subject of further research.

The  availability  of  variable  time  delay  can  be  utilized  in  systems  using  non-uniform 
sampling.  This  will  be  discussed  below. 


Impulse  Response  &  Frequency  Response  Synthesis 

The impulse response and frequency response of a transversal filter with a single input
x(t) can be found by taking the Fourier transform of the filter output as given by equation (1) [6].
This yields

$$Y(\omega) = \sum_{i=1}^{N} a_i \exp(-j\omega T_i)\, X(\omega) \tag{2}$$

By comparing equation (2) to the frequency domain version of the convolution property given
by

$$Y(\omega) = H(\omega)\, X(\omega) \tag{3}$$

it can be seen that the frequency response (transfer function) of a transversal filter is

$$H(\omega) = \sum_{i=1}^{N} a_i \exp(-j\omega T_i) \tag{4}$$

It follows that, by taking the inverse Fourier transform of the frequency response, the filter
impulse response is

$$h(t) = \sum_{i=1}^{N} a_i \, \delta(t - T_i) \tag{5}$$

where $\delta(t)$ is the Dirac delta function.
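
The transfer function of a tapped-delay filter can be evaluated directly from its weights and delays. The sketch below does this for an assumed two-tap filter; the function name `frequency_response` and the tap values are hypothetical and chosen only so that the result is easy to check by hand.

```python
import cmath
import math

def frequency_response(weights, delays, omega):
    """Evaluate H(omega) = sum_i a_i * exp(-j * omega * T_i)."""
    return sum(a * cmath.exp(-1j * omega * T) for a, T in zip(weights, delays))

# Assumed example: two equal taps spaced one second apart.
weights = [1.0, 1.0]
delays = [0.0, 1.0]

h_dc = frequency_response(weights, delays, 0.0)        # taps add in phase
h_null = frequency_response(weights, delays, math.pi)  # taps cancel
```

With these values the response is 2 at zero frequency and has a null at omega = pi rad/s, where the two taps are half a cycle apart. Changing a single delay moves the null, which is the extra degree of freedom that a continuously variable delay line provides.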

Non-Uniform  Sampling 

According to the sampling theorem, it is possible to reconstruct a signal from its samples
provided that the function is bandlimited and that the sampling frequency is greater than twice
the highest frequency of the signal [7]. This theorem was developed for the case when x(t) is
sampled by an impulse train at intervals spaced by T to produce a sampled function of the form

$$x_s(t) = \sum_{n=-\infty}^{\infty} x(nT)\, \delta(t - nT) \tag{6}$$

In the equation for $x_s(t)$, the signal x(t) is sampled at uniform intervals, that is, at integer multiples
of T.

Given  that  it  is  possible  to  construct  continuously  variable  delay  lines,  a  signal  can  be 
sampled  at  non-uniform  intervals.  It  has  been  shown  that,  for  the  case  of  non-uniform  sampling, 
a  result  similar  to  that  of  the  sampling  theorem  can  be  obtained  [8].  It  was  concluded  that  the 
sampled waveform $v_s(t)$ can be expressed as a sum of exponentials multiplied by the original
signal  v(t)  and  an  amplitude  factor.  The  first  term  in  the  exponential  expansion  has  the  same 
form  as  the  original  waveform  except  for  an  amplitude  distortion.  If  the  frequency  band  occupied 
by  this  term  does  not  overlap  the  frequency  bands  of  the  higher  order  terms,  low-pass  filtering 
will  separate  it  from  those  terms.  In  this  manner,  the  original  waveform  can  be  recovered  even 
though  non-uniform  sampling  was  used. 

In addition, non-uniform sampling has found extensive use in bandpass filtering [8]. It has
been shown that it is not possible to produce an asymmetric passband centered at frequency $f_0$
from a single filter containing samples uniformly spaced at intervals of $1/(2f_0)$ [9].

Phased  Array  Antenna  Analogy 

In  a  phased  array  antenna  system,  the  desired  radiation  pattern  is  obtained  by  controlling 
the  relative  amplitude  and  phase  of  the  signals  applied  to  the  individual  radiating  elements  [5]. 
The  combined  effect  of  all  the  elements  determines  the  shape  and  direction  of  the  far  field 
electromagnetic  beam  that  has  been  formed.  By  inserting  an  appropriate  phase  shift  at  each 
element  of  the  array,  the  main  beam  of  the  antenna  pattern  may  be  made  to  point  in  a  given 
direction.  One  method  of  introducing  a  phase  shift  involves  delaying  the  signal  sent  to  each 
element  of  the  array  as  the  signal  propagates  across  the  face  of  the  array.  A  relative  delay,  and 
therefore  phase  shift,  will  exist  between  the  array  elements.  Phase  shift  can  be  introduced  by 
utilizing  devices  such  as  diode  and  ferrite  phase  shifters.  However,  these  do  not  provide 
sufficient  bandwidth  if  wideband  operation  is  required. 

In a traditional phased array system, a continuously variable delay is not available. This
has restricted the use of phased array antenna systems to narrowband applications due to the
presence of squint [1]. Implementing the continuously variable delay line previously described
allows the utilization of phased array systems for wideband applications.

If a phased array system is used in the receive mode, there exists a similarity to a
transversal filter system [3]. When information from a far field point source is received by a
phased array system, each antenna element will receive a delayed version of that signal. For the
phased array antenna, varying the delay enables steering the beam in different directions.
Therefore, in a phased array receive system, varying the delays allows information from different
directions to be detected.
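
The squint effect can be illustrated numerically with a small sketch. It assumes a uniform linear array of 16 elements at half-wavelength spacing for the design frequency, scanned on a 0.1 degree grid; the function name `peak_angle` and all numerical values are illustrative assumptions rather than parameters from the report. The compensation at each element is either a fixed phase set at the design frequency (phase shifter) or a phase that scales with frequency (true time delay).

```python
import cmath
import math

C = 3.0e8  # assumed propagation speed in m/s

def peak_angle(f, f0, theta0_deg, n_elem=16, true_time_delay=False):
    """Return the angle (degrees) at which the summed array output peaks at frequency f."""
    d = C / (2.0 * f0)                       # half-wavelength spacing at the design frequency
    s0 = math.sin(math.radians(theta0_deg))  # steering direction
    f_comp = f if true_time_delay else f0    # a time delay gives a phase proportional to f
    best_angle, best_mag = None, -1.0
    for k in range(-900, 901):               # scan -90 to +90 degrees in 0.1 degree steps
        theta = k / 10.0
        s = math.sin(math.radians(theta))
        total = 0j
        for n in range(n_elem):
            arrival = 2.0 * math.pi * f * n * d * s / C
            compensation = 2.0 * math.pi * f_comp * n * d * s0 / C
            total += cmath.exp(1j * (arrival - compensation))
        if abs(total) > best_mag:
            best_angle, best_mag = theta, abs(total)
    return best_angle

f0 = 10.0e9
squinted = peak_angle(1.2 * f0, f0, 30.0, true_time_delay=False)
steady = peak_angle(1.2 * f0, f0, 30.0, true_time_delay=True)
```

Under these assumptions the phase-shifter beam moves from 30 degrees to roughly 24.6 degrees when the frequency is raised by 20 percent, while the true-time-delay beam stays at 30 degrees.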


Matched  Filter 

In order to determine the range and velocity of a target, a radar system first emits a
known signal. After being reflected by the target, the received signal contains the desired
information. However, noise is also received along with the signal. A network whose
frequency response maximizes the output peak-signal-to-noise ratio is required to detect the signal in the
presence of noise. One such network is the matched filter and it is used in the design of most
radar receivers [5].

The frequency response of a matched filter is

$$H(\omega) = k\, S^*(\omega) \exp(-j\omega t_1) \tag{7}$$

where $S^*(\omega)$ is the complex conjugate of the Fourier transform of the received signal s(t), k is
a constant equal to the maximum filter gain (generally taken to be one), and $t_1$ is a fixed value of
time at which the signal is observed to be a maximum. It can be seen that the amplitude spectrum
of the matched filter is the same as that of the signal and the phase spectrum of the matched filter
is the negative of that of the signal plus a phase shift proportional to frequency. By taking the
inverse transform of the frequency response, the impulse response can be seen to be

$$h(t) = k\, s^*(t_1 - t) \tag{8}$$

The output of a matched filter is not an exact replica of the input signal. However, the
output is proportional to the input signal cross-correlated with a replica of the transmitted signal,
except for the time delay $t_1$. The cross-correlation of two signals is defined as

$$R(t) = \int_{-\infty}^{\infty} y(\lambda)\, s^*(\lambda - t)\, d\lambda \tag{9}$$

If the input to the matched filter is $y_i(t) = s(t) + n(t)$, where n(t) represents noise, the output will
be

$$y_o(t) = \int_{-\infty}^{\infty} y_i(\lambda)\, h(t - \lambda)\, d\lambda = k \int_{-\infty}^{\infty} y_i(\lambda)\, s^*(\lambda - t + t_1)\, d\lambda = k\, R(t - t_1) \tag{10}$$

The equation above shows that the matched filter forms the cross-correlation between the
received signal and a replica of the transmitted signal.
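
A discrete-time sketch of this cross-correlation is given below. It assumes a 13-chip Barker code as the transmitted signal, additive Gaussian noise from a seeded generator, and a true delay of 20 samples; none of these choices come from the report, and the function names are hypothetical.

```python
import random

# Assumed transmitted waveform: the 13-chip Barker code.
BARKER13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]

def cross_correlation(y, s):
    """R[k] = sum_n y[n] * s[n - k] for every lag k at which s fits inside y."""
    return [sum(y[k + m] * s[m] for m in range(len(s)))
            for k in range(len(y) - len(s) + 1)]

def estimate_delay(y, s):
    """Return the lag at which the correlation with the replica is largest."""
    r = cross_correlation(y, s)
    return max(range(len(r)), key=lambda k: r[k])

rng = random.Random(0)          # fixed seed so the run is repeatable
true_delay = 20                 # assumed round-trip delay in samples
received = [rng.gauss(0.0, 0.3) for _ in range(60)]
for m, chip in enumerate(BARKER13):
    received[true_delay + m] += chip

estimated = estimate_delay(received, BARKER13)
```

The correlation peaks at the lag where the replica lines up with the echo, so the estimated delay equals the delay that was inserted. In a radar receiver this lag corresponds to the target range.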

The value of the correlation integral will be greatest when the two signals being correlated
are the same. By finding the value of t where the greatest value of the correlation exists, the
signal that was reflected by the target can be differentiated from the noise. This procedure enables
the range and velocity information to be extracted.

The Wavelet Transform

The continuous time wavelet transform (CWT) [4] produces a time-scale representation
of a time function x(t) and is defined by

$$\mathrm{CWT}_x(\tau, a) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} x(t)\, h^*\!\left(\frac{t - \tau}{a}\right) dt \tag{11}$$

The basis function h(t) is called a wavelet. These functions are obtained by scaling and shifting
a prototype wavelet. The prototype wavelet h(t) is a real or complex bandpass function.

The CWT can be written as an inner product of the form

$$\mathrm{CWT}_x(\tau, a) = \int_{-\infty}^{\infty} x(t)\, h^*_{a,\tau}(t)\, dt \tag{12}$$

In this case, the above integral measures the similarity between the signal and the wavelets given
by

$$h_{a,\tau}(t) = \frac{1}{\sqrt{|a|}}\, h\!\left(\frac{t - \tau}{a}\right) \tag{13}$$

As  previously  stated,  a  matched  filter  makes  use  of  a  correlation  integral  in  order  to 
determine  the  location  of  velocity  and  range  information.  This  integral  is  an  inner  product  similar 
to  that  for  the  CWT.  Therefore,  the  wavelet  transform  can  be  used  to  perform  the  correlation 
necessary  for  the  detection  of  information  in  a  radar  system. 

The detection procedure involves emitting a known signal h(t) which is then received with
a delay, $\tau$, and a distortion or scaling, a. The delay gives information about the target's
distance, and information about the velocity (Doppler shift) is obtained from a. For a wide-band
signal, where the Doppler shift is not confined to a single frequency, the CWT is given by

$$\mathrm{CWT}_x(\tau, a) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} x(t)\, h\!\left(\frac{t - \tau}{a}\right) dt \tag{14}$$

where x(t) represents the received signal. The CWT functions as a maximum likelihood estimator
and the wavelet which best fits the signal is used to estimate the parameters $\tau$ and a.
Example 

To illustrate the ideas presented in this report, a received signal with $\tau = 3$ and a = 2 was
probed by a filter bank. It was desired to show that the CWT would reach its maximum at those
values of $\tau$ and a. Rather than use the time domain version of the CWT as given by equation
(11), the frequency domain version of the CWT was used. This equation is given by


$$\mathrm{CWT}_x(\tau, a) = \frac{\sqrt{|a|}}{2\pi} \int_{-\infty}^{\infty} G(\omega)\, H^*(a\omega) \exp(j\omega\tau)\, d\omega \tag{15}$$

The received signal, $G(\omega)$, is represented by a rectangle function centered at $\omega = 200$ with a
bandwidth of 40. The frequency response of the filter bank, $H(\omega)$, is given by a rectangle
function centered at $\omega = 100$ with a bandwidth of 20. Upon analysis of equation (15), the
maximum value of the CWT was determined to be 18.006, located at $\tau = 3$ and a = 2. This
agrees with the value predicted for $\tau = 3$ and a = 2.

Conclusions

Throughout  this  report,  the  wideband  characteristic  of  phased  array  systems  employing 
a  continuously  variable  time  delay  has  been  stressed.  By  enabling  these  systems  to  operate  in  this 
manner,  they  are  no  longer  restricted  to  narrowband  use  since  the  beamforming  error  known  as 
squint  may  be  eliminated.  Prior  to  the  development  of  the  acousto-optic  and  fiber  optic  based 
optical  processors  discussed  here,  other  methods  for  generating  variable  time  delay  had  been  lossy 
and  impractical. 

The  phased  array  systems  discussed  make  use  of  the  signal  processing  technique  of 
transversal  filtering.  The  strength  of  transversal  filters  lies  in  their  ability  to  be  reconfigurable 
thereby  enabling  them  to  function  in  an  environment  that  is  not  completely  characterized  or  is 
changing.  Typically,  the  adaptability  of  a  transversal  filter  lies  in  the  optimization  of  the  weight 
vector.  However,  the  availability  of  continuously  variable  time  delay  introduces  another  set  of 
parameters  which  may  be  optimized  to  improve  system  performance.  Future  research  should 
include  determining  the  influence  of  these  delay  values  and  the  manner  in  which  they  may  be 
adjusted  for  decreased  error. 

It has been shown that, for wideband systems, a wavelet transform can perform the
cross-correlation operation of a matched filter in a receive system. Due to the increased importance of
wavelet theory in the area of signal processing, it would be advantageous to incorporate wavelet
methods in future transversal filter architectures for wideband phased array systems.

IMPLEMENTATION OF THE ITT MULTIPLE PARAMETER
SPEAKER RECOGNITION ALGORITHM ON THE SUN SPARC

Robert  L.  Gorsegner 
Graduate  Student 

Department  of  Electrical  Engineering 
University  of  Alabama 

Abstract 

I intend to research how speaker recognition systems work by implementing the ITT multiple
parameter speaker recognition algorithm in ‘C’ on the Sun Sparc workstation. I will test the
accuracy by using the KING database. I also intend to change the algorithm in different ways to
see how accuracy is affected.

1  Introduction 

Speaker recognition has many possible practical uses. Some of these include:

• Having computers follow who is speaking in a conversation,

• Verifying users of telephone banking,

• Supporting speech recognition (the ability of a computer to decode what the speaker is saying):
by applying different models to different speakers, the computer will be better able to identify
words,

• and many others.

In general, speaker recognition works in the following way. Speaker recognition algorithms
convert the time domain signal into the frequency domain. Then some type of average of the
frequency domain representation is determined for each speaker the algorithm is trained on. An
unknown speaker is then tested against every speaker in the trained set, and some kind of distance
between the two frequency domain representations is calculated. The closest distance should give
the correct speaker.

2  Algorithm  Description 

The multiple parameter algorithm works in the following way. A frame of data is read in. The
data is then filtered by a 16th-order Butterworth band-pass IIR filter. This filter has cutoff frequencies
of 350 Hz and 2800 Hz. This filter is used to limit the frequency range coming into the speaker recognition
system. I have found that this filter increases the accuracy by ten percent for ten speakers and ten
seconds. A speech/nonspeech detection algorithm is used next to determine if the frame is speech
or just noise. In my algorithm I used the energy per frame as a silence/speech detector. The
data is then windowed by a Hamming window as used by ITT [4]. Then the LPC and reflection
coefficients are calculated. I found that ten coefficients is a good number to use, represented by the
constant (M). ITT also used this number of coefficients [4]. Linear Predictive Coding (LPC) coefficients
are calculated by Levinson or Durbin recursion. The LPC coefficients (shown by the variable A)
predict the next sample of data by the formula below. The variable n is the current sample
data number.

$$Y_{n+1} = Y_n A_0 + Y_{n-1} A_1 + Y_{n-2} A_2 + \cdots + Y_{n-M} A_M \tag{1}$$

The variable (Y) denotes the value of each point in a frame. The predictor error (pe) is the
difference, totaled over the frame, between the actual value of $Y_{n+1}$ and the calculated value of
$Y_{n+1}$. It is also the auto correlation function of the frame. The LPC cepstrum (C) is calculated
from the LPC coefficients by the following.

$$C_0 = \log_{10}(pe) \tag{2}$$

Let m go from 1 to M.

$$C_m = A_m + \frac{(m-1)\, C_{m-1} A_1 + (m-2)\, C_{m-2} A_2 + \cdots + C_1 A_{m-1}}{m} \tag{3}$$
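
The recursion in equations (2) and (3) can be sketched as follows. The sketch is written in Python for brevity rather than in the C of the actual program, and the function name `lpc_to_cepstrum` is hypothetical. It assumes the coefficients are indexed from 1, as in the recursion (index 0 of the list is unused), and the test values are illustrative only.

```python
import math

def lpc_to_cepstrum(A, pe):
    """Return C[0..M] from LPC coefficients A[1..M] (A[0] is ignored) and predictor error pe."""
    M = len(A) - 1
    C = [0.0] * (M + 1)
    C[0] = math.log10(pe)
    for m in range(1, M + 1):
        # Sum of k * C[k] * A[m - k] for k = 1 .. m-1; the k = m-1 term is (m-1) * C[m-1] * A[1].
        acc = sum(k * C[k] * A[m - k] for k in range(1, m))
        C[m] = A[m] + acc / m
    return C

# Assumed check case: a single non-zero coefficient A_1 = 0.5 and pe = 100.
cepstrum = lpc_to_cepstrum([0.0, 0.5, 0.0, 0.0], 100.0)
```

For a single non-zero coefficient A_1 = a the recursion gives C_m = a^m / m, so the example yields 0.5, 0.125 and about 0.0417 for m = 1, 2, 3, with C_0 = 2.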

The cepstrum represents the spectrum of the spectrum. The LPC cepstrum is used instead of
the FFT cepstrum since it gives a smoother frequency response and requires less computation. The
FFT cepstrum is

$$\mathrm{FFTcepstrum} = \left| \mathrm{FFT}\!\left( \log_{10} \left| \mathrm{FFT}(\mathrm{frame}) \right| \right) \right| \tag{4}$$


After the LPC cepstral coefficients are calculated for the frame, they are added to a running
total over all the frames. Each cepstral coefficient in a frame is also multiplied by every other cepstral
coefficient in that frame. These products are also kept in a running total, producing an M by M
matrix. By using this method it is not necessary to keep track of the cepstral coefficients for each
frame. After all the frames of speech have passed through, the sums are divided by the number of
frames, producing an average cepstral vector (X). The M by M matrix is also divided by the number
of frames. The product of each pair of terms of the average cepstral vector is then subtracted from
the corresponding element of the matrix. This produces the covariance matrix (R), which is shown
in equations 5 and 6. The variable i represents the frame number and j, k represent coefficient
numbers. N is equal to the total number of frames.

$$X_j = \frac{1}{N} \sum_{i=1}^{N} C_{i,j} \tag{5}$$

$$R_{jk} = \left( \frac{1}{N} \sum_{i=1}^{N} C_{i,j}\, C_{i,k} \right) - X_j X_k \tag{6}$$
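
The running-total method can be sketched as below, again in Python rather than the C of the actual program. The function name `cepstral_statistics` and the two-frame example are illustrative assumptions; only the sums are stored, never the individual frames.

```python
def cepstral_statistics(frames):
    """Accumulate running sums over frames, then return the mean vector X and covariance R."""
    M = len(frames[0])
    N = 0
    sum_c = [0.0] * M
    sum_cc = [[0.0] * M for _ in range(M)]
    for c in frames:                 # each frame is a list of M cepstral coefficients
        N += 1
        for j in range(M):
            sum_c[j] += c[j]
            for k in range(M):
                sum_cc[j][k] += c[j] * c[k]
    X = [s / N for s in sum_c]
    R = [[sum_cc[j][k] / N - X[j] * X[k] for k in range(M)] for j in range(M)]
    return X, R

# Assumed toy input: two frames with two coefficients each.
X, R = cepstral_statistics([[1.0, 2.0], [3.0, 6.0]])
```

For this toy input the mean vector is [2, 4] and the covariance matrix is [[1, 2], [2, 4]]. Because `frames` can be any iterable of coefficient lists after the first element is inspected, the same accumulation works when frames arrive one at a time from a file.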

The  covariance  matrix  and  average  cepstral  vector  are  calculated  and  saved  for  each  speaker.  To 
recognize  an  unknown  speaker,  the  average  vector  must  be  compared  against  the  trained  speaker’s 
average  vector  and  covariance  matrix.  This  is  done  by  computing  the  Mahalanobis  distance. 

$$D = (X - M)^T R^{-1} (X - M) \tag{7}$$

where, 

•  D  is  the  Mahalanobis  distance, 

•  X  is  the  average  input  vector, 

•  M  is  the  average  training  vector  for  the  speaker  model, 

•  R  is  the  covariance  matrix  for  speaker  model. 

The  lowest  distance  should  be  the  correct  speaker. 
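
A minimal sketch of the distance computation and the speaker decision follows, in Python rather than the C of the actual program. It assumes the covariance matrix is non-singular and solves the linear system by Gaussian elimination instead of forming the inverse explicitly; the function names and the two toy speaker models are hypothetical.

```python
def solve(R, b):
    """Solve R y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(R[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    y = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(aug[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (aug[i][n] - rest) / aug[i][i]
    return y

def mahalanobis(X, M, R):
    """D = (X - M)^T R^{-1} (X - M)."""
    d = [x - m for x, m in zip(X, M)]
    return sum(di * yi for di, yi in zip(d, solve(R, d)))

def identify(X, models):
    """Return the name of the model (mean vector, covariance matrix) with the smallest distance."""
    return min(models, key=lambda name: mahalanobis(X, *models[name]))

identity = [[1.0, 0.0], [0.0, 1.0]]
models = {"speaker_a": ([0.0, 0.0], identity), "speaker_b": ([5.0, 5.0], identity)}
winner = identify([1.0, 1.0], models)
distance = mahalanobis([1.0, 1.0], [0.0, 0.0], [[2.0, 0.0], [0.0, 0.5]])
```

In the toy case the test vector lies closest to the first model, so `winner` is "speaker_a". The final line gives D = 1/2 + 1/0.5 = 2.5 for a diagonal covariance.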
Figure 1: Flow of Algorithm

3  Speech  Silence  Detection  Algorithm 

I also included a subroutine in my program using the speech/nonspeech detection method
described in “Monthly Status Report for the Project: A Modulated Model for Speaker Identification”
[2]. The speech/silence algorithm goes as follows. The first five frames are considered to be silence
so that a statistical analysis on the background noise can be performed. The following variables
are calculated:

(Energy Threshold) $T_1 = ave + a_1\, ste$, $a_1 > 0$

(Zero Crossing Rate) $T_2 = avz + a_2\, stz$, $a_2 > 0$

(Magnitude Threshold) $T_3 = avm + a_3\, stm$, $a_3 > 0$

where, 

•  ave  :  average  energy  per  frame 

•  avz  :  average  number  of  zero  crossings  per  frame 

•  avm  :  average  magnitude  per  frame 

•  ste  :  standard  deviation  of  energy 

•  stz  :  standard  deviation  of  zero  crossing  rate

•  stm  :  standard  deviation  of  magnitude 

According to the designers of the algorithm, a good choice of $a_1$, $a_2$, and $a_3$ is three, which is
the value I used in my program.

After the speech/silence statistics are calculated, the rest of the program data is tested.

If (E >= $T_1$) or (Z >= $T_2$) or (M >= $T_3$) then the frame is classified as speech, where

•  E  :  Energy  of  the  frame 

•  Z  :  Number  of  zero  crossings 

•  M  :  Average  magnitude  of  the  frame 

I did not use this method since in the KING database [1] the first few frames sometimes were
abnormally silent and other times had voice in them. This caused the algorithm to fail.
Instead I preset a value for $T_1$ = 2500 (framesize). I did not use $T_2$ or $T_3$ to test for speech. The
code that calculates the values of $T_2$ and $T_3$ is commented out.
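
The three-threshold method described in this section can be sketched as follows, in Python rather than the C of the actual program. The sketch assumes that frame energy is the sum of squared samples and that the standard deviations are population values; the function names and the synthetic frames are illustrative and are not taken from the KING database.

```python
import math

def frame_features(frame):
    """Return (energy, zero crossings, average magnitude) for one frame of samples."""
    energy = sum(s * s for s in frame)
    crossings = sum(1 for p, q in zip(frame, frame[1:]) if (p >= 0) != (q >= 0))
    magnitude = sum(abs(s) for s in frame) / len(frame)
    return energy, crossings, magnitude

def mean_std(values):
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)

def thresholds(silence_frames, a=3.0):
    """T = average + a * standard deviation, for each of the three features."""
    features = [frame_features(f) for f in silence_frames]
    return tuple(m + a * s for m, s in (mean_std(col) for col in zip(*features)))

def is_speech(frame, T):
    E, Z, M = frame_features(frame)
    return E >= T[0] or Z >= T[1] or M >= T[2]

# Assumed background: five low-level frames taking the values -1, 0, 1.
silence_frames = [[(k % 3) - 1 for k in range(n, n + 16)] for n in range(5)]
T = thresholds(silence_frames)
loud = [100 * (-1) ** k for k in range(16)]
quiet = [0] * 16
```

With these synthetic frames the loud frame exceeds the energy threshold and is classified as speech, while the all-zero frame falls below all three thresholds. The fixed-threshold variant actually used in the program would replace `thresholds` with a preset value for the energy test only.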

4  Program  Usage 

To train the speaker identification program for a specific speaker, use the format in figure 2.
This creates or appends to a parameter file. The name of the parameter file has the window length
and type appended to it. The types of windows available are (0) rectangular, (1) Gaussian,
(2) Hamming, and (3) raised cosine. The program will train on the utterance until either the
specified number of frames has been reached or the end of the file is encountered. If it hits the
end of the file prematurely, no warning will be given.

To  test  an  utterance  the  user  changes  the  ‘Train/Test’  flag  to  1.  The  program  will  show  the 
distance  between  each  speaker  and  the  winning  distance. 

5  Program  Description 

The Speaker Recognition system has a training mode and a testing mode. In the training mode
LPC cepstral coefficients are generated for the frames of speech data and stored in a parameter
file. The testing mode generates coefficients for the unknown speaker and compares them to the
coefficients in the parameter file. It then reports the results to the user.

      +---------------------------- Points in the window
      |   +------------------------ Order of predictor
      |   |  +--------------------- Filename of 16-bit speech
      |   |  |        +------------ Window type
      |   |  |        | +---------- Start of utterance in frames
      |   |  |        | | +-------- Length of speech to test on in frames
      |   |  |        | | |     +-- Train (0) / Test (1) flag
      |   |  |        | | |     |
      V   V  V        V V V     V
sp_id 256 10 filename 2 0 12000 0

Figure 2: Format of the command line used to call the speaker identification program.

The ITT speaker identification system is coded in C and follows the flowchart in figure 3 [4]. The
first block allocates memory for the covariance matrix, average vector, and window values. Next,
the speaker identification program gets its arguments from the command line. The arguments
include the test speaker file, window (frame) size, window type, order of predictor, starting point,
and length of speech data to train on. The window types available in the program are
rectangular, Gaussian, Hamming, and raised cosine. For accuracy, the same window must be used
for training and testing. The window is then calculated and stored in memory, so
that it does not have to be recalculated for each frame.
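
Precomputing the window once, as described above, might look like the following sketch. It is written in Python for brevity, not in the C of the program. The type numbers follow section 4, but the exact formulas are assumptions: in particular, the Gaussian width `sigma` is a hypothetical parameter, and the raised cosine is taken to be the usual Hann window.

```python
import math


def make_window(n, window_type, sigma=0.4):
    """Return n window values, computed once and reused for every frame.

    Types: 0 rectangular, 1 Gaussian, 2 Hamming, 3 raised cosine.
    ASSUMPTION: sigma is the Gaussian width as a fraction of the half
    length; the report does not give the value that was used.
    """
    if n < 2:
        raise ValueError("window needs at least two points")
    mid = (n - 1) / 2.0
    if window_type == 0:
        return [1.0] * n
    if window_type == 1:
        return [math.exp(-0.5 * ((i - mid) / (sigma * mid)) ** 2) for i in range(n)]
    if window_type == 2:
        return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]
    if window_type == 3:
        return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]
    raise ValueError("unknown window type")


def apply_window(frame, window):
    """Multiply a frame by the stored window, sample by sample."""
    return [s * w for s, w in zip(frame, window)]


hamming = make_window(256, 2)          # computed once, before the frame loop
frame = apply_window([1.0] * 256, hamming)
```

Because training and testing must use the same window, a real implementation would probably store the window type and length with the parameters, which is what appending them to the parameter file name achieves.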

Figure 3: Main flow of the program.

The Calculate Cepstral Covariance Matrix and Average Vector block of code (figure 4) reads a 16-bit
speech data file and calculates the average and the covariance matrix of the cepstral coefficients.
The first step is to read a frame of speech. The frame is then filtered with a subband filter
and windowed. If the energy of the frame is not above a certain value, a new frame is read in. The
linear predictive coding (LPC) coefficients and the reflection coefficients are calculated using the
subroutine found in "Voice and Speech Processing" by T. Parsons. Some of the variables in this
subroutine are required to be double precision floating point numbers (8 bytes) instead of single
precision floats. The LPC coefficients are then passed to a subroutine found in "Linear Prediction
of Speech" [3] to calculate the cepstral coefficients. The zeroth cepstral coefficient is calculated but
not used. The cepstral coefficients are totaled. Within each frame, every cepstral coefficient is
multiplied by every other cepstral coefficient, and a running total of each product is kept in the
covariance matrix. This process continues until the end of the speech data file is reached or the set
number of frames has been processed. At the end, the totals of the cepstral coefficients are divided
by the number of frames to produce the average of the cepstral coefficients. The totals in the
covariance matrix are also divided by the number of frames, and the product of the corresponding
average cepstral coefficients is then subtracted. See the covariance formula for a fuller explanation.
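
The running-total bookkeeping described above amounts to C[i][j] = (1/N) * sum(c_i * c_j) - mean_i * mean_j. A minimal Python sketch follows; the function name is hypothetical, and the cepstral vectors are assumed to be already computed, with the zeroth coefficient left out.

```python
def cepstral_statistics(frames):
    """Average vector and covariance matrix from running totals.

    frames: list of equal-length cepstral vectors, one per speech frame
    (assumed to be computed elsewhere, zeroth coefficient excluded).
    """
    p = len(frames[0])
    total = [0.0] * p                         # running sum of each coefficient
    cross = [[0.0] * p for _ in range(p)]     # running sum of each product c_i * c_j
    for c in frames:
        for i in range(p):
            total[i] += c[i]
            for j in range(p):
                cross[i][j] += c[i] * c[j]
    n = len(frames)
    mean = [t / n for t in total]
    # Divide the product totals by the frame count, then subtract mean_i * mean_j.
    cov = [[cross[i][j] / n - mean[i] * mean[j] for j in range(p)] for i in range(p)]
    return mean, cov


mean, cov = cepstral_statistics([[1.0, 2.0], [3.0, 6.0]])
print(mean, cov)
```

Keeping only the totals means that no frame has to be stored, so the memory needed does not grow with the length of the utterance.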

When training, the speaker identification system writes (or appends, if the parameter file already
exists) the speaker name, average vector, and covariance matrix in ASCII. The file can be viewed
with the 'more' or 'pg' UNIX commands.

During testing, the parameters for each trained speaker are read. The average vector of each
trained speaker is subtracted from the average vector of the unknown speaker. This difference is
multiplied by the inverse of the trained speaker's covariance matrix, and then by the transpose of
the difference of the average vectors. This calculation results in a scalar value called the
Mahalanobis distance. The process is repeated for all trained speakers. The lowest of these
distances should correspond to the correct speaker.
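
The distance computation above can be sketched in Python as shown below. This is an illustration under stated assumptions, not the C code of the program. All names are hypothetical, and a linear solve is used in place of an explicit matrix inverse. The quadratic form d^T C^-1 d computed here is, strictly speaking, the squared Mahalanobis distance; it ranks the speakers in the same order.

```python
def solve(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                     # back substitution
        s = m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def mahalanobis(unknown_mean, speaker_mean, speaker_cov):
    """d^T C^-1 d, with d the difference of the two average vectors."""
    d = [u - s for u, s in zip(unknown_mean, speaker_mean)]
    y = solve(speaker_cov, d)                          # y = C^-1 d
    return sum(di * yi for di, yi in zip(d, y))


def identify(unknown_mean, speakers):
    """speakers maps a name to (average vector, covariance matrix)."""
    distances = {name: mahalanobis(unknown_mean, mean, cov)
                 for name, (mean, cov) in speakers.items()}
    return min(distances, key=distances.get), distances


trained = {
    "speaker1": ([0.0, 0.0], [[4.0, 0.0], [0.0, 1.0]]),
    "speaker2": ([2.0, 1.0], [[1.0, 0.0], [0.0, 1.0]]),
}
winner, distances = identify([2.0, 1.0], trained)
print(winner, distances)
```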

6  Test  Results 

I used the KING database for testing. The KING database has fifty speakers. Each of these
speakers spoke in ten sessions of about forty seconds each [1]. Each session was recorded both wide
band and through a narrow band channel, at both ends of a telephone line. In my
tests I have only used the wide band speech. In my first test I used sessions one through three of
the wide band speech to train the speaker identification system. I tested on six ten-second intervals
of sessions four and five. I used ten cepstral coefficients and a 16th-order Butterworth bandpass filter
with cutoff frequencies of 350 Hz and 2800 Hz. I generated a confusion matrix (Table 1) to show
the results. To generate the confusion matrix I modified my C program to do batch runs. The
accuracy over these runs was 68 percent. I also kept track of the average distance (Table 2) for each
utterance. Testing on thirty-two different two-second intervals is shown in Table 3. An accuracy
of 53 percent was obtained.

Table 1: Confusion matrix showing performance of speaker identification algorithm using ten
speakers and ten second utterances.

An error in my program, which is now fixed, showed something interesting. I accidentally
used a window that was normal for one half of the frame and inverted for the other half. I found
only an 18 percent reduction in correct identification for this weird window. The results for
ten speakers and ten-second utterances are found in table 4. The accuracy was 50 percent.

              1      2      3      4      5      6      7      8      9     10

   1                                         1.174  1.013  1.221  0.624  5.465
   2      3.948  0.307  3.279  2.860  2.887  2.803  4.983  5.067  2.844  4.762
   3      3.438  4.630  0.435  1.427  1.042  1.094  1.807  1.326  1.072  4.701
   4      7.063  5.476  1.501  0.846  3.723  1.994  0.756  1.480  1.231  7.918
   5      4.032  3.236  1.588  2.285  0.278  0.716  3.767  3.090  1.228  5.454
   6      3.583  3.200  1.061  1.523  0.962  0.426  1.939  2.281  0.447  4.234
   7      5.727  5.970  1.909  1.197  5.262  3.451  0.299  1.582  3.448  5.038
   8      4.385  5.585  0.981  1.099  1.647  0.817  2.154  0.700  0.829  5.657
   9      4.134  3.874  1.055  0.952  1.236  0.695  1.518  2.112  0.376  5.238
  10      4.826  1.512  1.404  1.543  1.661  0.889  1.874  3.030  0.601  5.324

Table 2: Confusion matrix showing average Mahalanobis distances using ten speakers and ten
second utterances.


Known                 Identified Speaker
Speaker     1    2    3    4    5    6    7    8    9   10

   1       16    0    2    5    0    0    2    1    2    1
   2       11   27    0    0    0    0    0    0    0    9
   3        2    0   18    0    7    2    0    6    0    4
   4        0    0    3   10    0    0    3    4    0    2
   5        0    2    0    0   25    9    0    0    3    0
   6        3    0    2    2    0   14    0    0   13    2
   7        0    0    0    6    0    0   27    2    0    5
   8        0    0    6    2    0    5    0   19    0    4
   9        0    0    0    6    0    2    0    0   12    2
  10        0    3    1    1    0    0    0    0    2    3

Table 3: Confusion matrix showing performance of speaker identification algorithm using ten
speakers and two second utterances.
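
The 53 percent figure quoted for the two-second test can be checked against Table 3: accuracy is the sum of the diagonal entries divided by the total number of trials. The sketch below is a Python illustration with hypothetical names. The one illegible off-diagonal entry (row 10, column 1) is assumed to be 0, which does not change the diagonal sum.

```python
# Table 3 as printed; the illegible entry in row 10, column 1 is ASSUMED to be 0.
TABLE_3 = [
    [16, 0, 2, 5, 0, 0, 2, 1, 2, 1],
    [11, 27, 0, 0, 0, 0, 0, 0, 0, 9],
    [2, 0, 18, 0, 7, 2, 0, 6, 0, 4],
    [0, 0, 3, 10, 0, 0, 3, 4, 0, 2],
    [0, 2, 0, 0, 25, 9, 0, 0, 3, 0],
    [3, 0, 2, 2, 0, 14, 0, 0, 13, 2],
    [0, 0, 0, 6, 0, 0, 27, 2, 0, 5],
    [0, 0, 6, 2, 0, 5, 0, 19, 0, 4],
    [0, 0, 0, 6, 0, 2, 0, 0, 12, 2],
    [0, 3, 1, 1, 0, 0, 0, 0, 2, 3],
]


def correct_count(confusion):
    """Number of trials on the diagonal, i.e. identified as the known speaker."""
    return sum(confusion[i][i] for i in range(len(confusion)))


def accuracy(confusion):
    """Fraction of all trials that fall on the diagonal."""
    total = sum(sum(row) for row in confusion)
    return correct_count(confusion) / total


print(correct_count(TABLE_3), round(100 * accuracy(TABLE_3)))
```

With these entries, 171 of 320 trials fall on the diagonal, which agrees with the 53 percent stated in the text.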
Known

Speaker 

1 

2 

Identified  Speaker 

3  4  5  6  7  8 

9 

10 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

6 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

6 

0 

0 

0 

0 

3 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

5 

0 

0 

0 

0 

6 

3 

0 

0 

0 

0 

6 

0 

0 

0 

0 

0 

3 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

3 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

6 

0 

0 

6 

0 

0 

3 

3 

6 

6 

10 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

Table 4: Confusion matrix showing performance of speaker identification algorithm using ten
speakers and ten second utterances with the weird window.

7  Future  Improvements 


I intend to continue my work on speaker recognition as part of my master's thesis at the
University of Alabama. I intend to try several things to improve performance. These include using
reflection coefficients, using FFT-cepstral coefficients, varying the order of predictor, trying
different front end filters, and using a clustering method such as K-means. I have written a program
using K-means codebooks but I have not debugged it yet. I also intend to look for possible ways of
speeding up the code, for instance changing floating point numbers to integers, or implementing it
on a dedicated DSP board. I also made an X-window interface for the program. Currently not
all the options of the windowed version are working. It was designed using an interface design tool
called Guide [5]. The program shows the window in figure 5 with all of its options. The
window interface version is very user friendly.

[Screen capture of the X-window interface. Input fields: speaker directory, speaker file,
parameter directory, parameter file, number of frames, window size, window type, coefficient
type, algorithm type, codebook size, number of training iterations, sampling rate, starting
position (sec), and length (sec). The output pane lists the distance from the test file to each
trained speaker, the closest distance, and the winning speaker.]

Figure 5: Picture of window interface.


MATHEMATICAL DESCRIPTION, COMPUTER SIMULATION AND ANALYSIS
OF A POINTING, ACQUISITION AND TRACKING SYSTEM
FOR OPTICAL INTERSATELLITE CROSSLINKS

Carl  R.  Herman 
Graduate  Student 

Department  of  Electrical  Engineering 
Binghamton  University 

Abstract

The  mathematical  model  of  a  pointing,  acquisition  and  tracking 
(PAT)  optical  intersatellite  crosslink  system  is  developed.  It 
describes  a  laboratory  prototype,  created  to  demonstrate  various 
aspects  of  rapid-retargeting  bi-directional  laser  communications 
between  independent  space-based  stations.  The  model,  obtained  by 
the  detailed  analysis  of  system  hardware,  represents  the  dynamic 
properties  of  the  major  optical,  electrical  and  electro-mechanical 
system  components.  A  computer  simulation  program,  utilizing  the 
mathematical  model,  is  developed.  The  validity  of  the  model  is 
assured  by  comparing  the  experimental  and  simulated  system 
responses  to  various  operational  conditions.  The  developed  model 
provides  a  versatile  and  cost-effective  alternative  for 
verification  of  novel  concepts  in  optical  intersatellite 
crosslinking,  and  specifically,  advanced  control  strategies. 


INTRODUCTION 

The PAT system, built by Ball Aerospace Corporation, was
created to demonstrate three concepts in rapid-retargeting,
bi-directional laser communications between independent space-based
stations:

1) an open-loop pointing system capable of rapid selection
and sequential communication to a large number of optical stations,

2) an open-loop pointing system with multiple beam directors
and beam switching capabilities allowing multiplexing in time and
beam slewing,

3) a closed-loop pointing system with beam switching that
allows for sequential establishment of a large number of two-way,
closed-loop communication links.

The  system  consists  of  one  stationary  terminal,  one  mobile 
remote  terminal,  and  one  stationary  remote  terminal,  representing 
three satellites in low earth orbit, medium earth orbit, or
geosynchronous  earth  orbit.  The  primary  station  includes  its  own 
electronics  rack  and  computer,  while  the  remote  stations  share  a 
common  electronics  rack  and  computer. 

In addition to the demonstrations, operation of the optical
system for each terminal is characterized by a set of pretests that
measure critical performance parameters (Table 1).

Acquisition beam power
Acquisition beam irradiance profile
Communication beam power
Communication beam irradiance profile
Bit Error Rate (BER) vs. received power
Average BER at primary station
Average BER at remote stations
Average BER while tracking a moving target

Table 1. Pretests performed before operation of the PAT system.


The primary station hardware includes a Hewlett Packard Model
330 computer, an electronics rack housing the support electronics
for the electro-optic and electro-mechanical components, and an
optical table where the optical components are mounted.


The primary station optic and electro-optic systems can be
categorized into six groups: lasers, detectors, beam conditioners,
ray optics, filters, and beam directors (see Figure 1). The two
lasers of the primary station are semiconductor lasers with optical
output powers of 10 mW and 100 mW, and center wavelengths of 830 nm
and 800 nm respectively. There are three types of optical
detectors in the primary station, two types for tracking purposes
and one for communications. The tracking detector is a quadrant
silicon avalanche photodiode, the acquisition detectors are
improved tetra-lateral position sensitive devices (PSDs), and the
communications detector is a silicon avalanche photodiode. The
beam conditioners and ray optics assure the required
characteristics of the communication and beacon laser beams for
detection and transmission. The optical filters are used to
separate these beams so that each can be processed individually.
The beam directors (CSM 1 and CSM 2) and the fast steering mirror
(FSM) are used to direct the transmitted beams and to position the
received beams.

The primary station electric and electro-mechanic systems can
be categorized into five groups: the computer, the computer
interface, the fine steering subsystem, the coarse steering
subsystem, and the wide field of view (WFOV) subsystem. The
computer interface consists of an HP general purpose input/output
(GPIO) board and interface electronics. Both the fine steering
subsystem and the coarse steering subsystem are analog closed-loop
control circuits activated by the computer. The WFOV camera is
stationary and therefore contains no control system. The camera's
sole purpose is to detect the remote stations' beacon light emitting
diodes (LEDs) and provide initial pointing information, which is
input directly to the computer. The function of each subsystem and
the interaction between subsystems is controlled by the computer
(Appendix A).

The remote stations consist of an HP Model 330 computer, an
electronics rack housing the support electronics for electro-optic
and electro-mechanic components, and two optical tables, one
stationary and one mobile, where optical components are mounted.

Like the primary station, the remote stations' optic and
electro-optic systems are composed of lasers, detectors, beam
conditioners, ray optics, filters, and beam directors (see Figure
2). The two lasers in the remote station are semiconductor lasers
with optical output powers of 10 mW and 50 mW, and center
wavelengths of 780 nm and 810 nm respectively. The communications
detector is a silicon avalanche photodiode and the acquisition
detector is a silicon photodiode. Each station contains beam
conditioners and filters to prepare the communication and beacon
laser beams for detection and transmission. As with the primary
station, the optical filters isolate the four wavelengths of light
in the system so that each can be processed individually.

As with the primary station, the function of each subsystem
and the interaction between subsystems is controlled by the
computer (Appendix A).

PRINCIPLE  OF  SYSTEM  OPERATION 

The control system in the PAT station has two principal
functions: to point the communications laser from the primary
station to remote stations, and to position the communications
laser from the remote stations on the communication receiver of the
primary station. The operation of this control system is based on
three stages of positioning: initial coarse steering mirror (CSM)
orientation, coarse steering, and fine steering.

The  first  stage  uses  the  "very  coarse"  information  on  target 
position  obtained  from  the  WFOV  camera  to  point  the  primary  station 
lasers  at  the  remote  stations.  The  image  created  by  the  beacon 
LEDs  located  on  each  of  the  remote  stations  is  captured  by  the 
optics  of  the  WFOV  camera.  This  image  is  analyzed  and  the 
coordinates  of  the  targets  are  input  directly  into  the  computer. 
The computer uses this information to calculate the displacements
of the coarse steering mirrors required for the initial CSM
orientation. At this stage the primary station beams are pointed
in the direction of the remote stations and the communication and
beacon beams coming from the remote stations will be used for fine
positioning of the incoming communication beams. This initial
orientation of the CSMs does not rely on feedback information on
beam position and is therefore open-loop in nature.

The next stage positions the communication laser
based  on  the  signals  generated  by  the  acquisition  detectors  which 
provide  information  on  the  deviation  of  the  laser  position  from  the 
center  of  the  PSD.  These  signals  are  fed  back  directly  to  the 
servo-control  electronics  of  the  CSMs.  In  this  fashion  the 
closed-loop  control  system  adjusts  the  orientation  of  the  CSMs  to 
maintain  the  communication  laser  position  in  the  center  of  the 
PSDs.  When  a  communication  laser  is  positioned  in  the  center  of  a 
PSD  it  is  then  said  to  be  "acquired". 

The  final  stage  in  positioning  the  laser  uses  the  incoming 
beacon  laser  from  the  remote  stations  to  provide  precise  fine 
position information. The beacon laser is narrowed by a 40X
telescope  (see  Figure  1)  and  then  focused  onto  a  quadrant  silicon 
avalanche  photodiode.  The  positioning  error  information,  obtained 
by  the  quadrant  detector,  is  fed  back  directly  to  the  servo-control 
electronics  of  the  FSM.  In  this  fashion  the  closed-loop  control 
system  continually  adjusts  the  orientation  of  the  FSM  to  maintain 
the  beacon  laser  position  in  the  center  of  the  quadrant  detector. 

The design of the optical system assures that if the beacon
laser is positioned correctly on the quadrant detector the
communications laser will be positioned on the communications
receiver. Thus the communications laser is used for coarse
positioning and the beacon laser is used for fine positioning.
When a target is moving, the quadrant detector provides the
information on position change and the FSM is adjusted to compensate.

ENVIRONMENTAL EFFECTS

Optical  interference  from  external  sources  is  a  major  concern 
in  the  operation  of  space-based  laser  communication  systems.  To 
simulate  optical  noise  in  the  PAT  system  "white"  light  is 
introduced  into  the  communication  laser's  path  by  a  fiber  optic 
link (see Figure 1). White light simulates typical radiation
spectra  encountered  in  space-based  systems. 

Platform vibration (jitter) is one of the most important issues
in satellite communication systems. Internal cyclic moments
created by gyroscopes, servo-mechanisms, etc., thermal
expansion/contraction of the platform body, and gravitational field
effects create a constant source of platform vibration. Other
sources of vibration exist that are not cyclic in nature and create
dynamic, unpredictable perturbations in the stability of the
platform. Table 2 presents a list of possible sources of platform
vibration or "jitter". The dominance of any one source of
vibration will depend entirely on the physical nature of the
satellite and on the specifics of operation. Simulated platform
jitter is introduced into the PAT system by a galvanometer placed
in the optical path of the primary station (see Figure 1). The
spectrum of vibration frequencies introduced by the galvanometer is
dominated by low frequencies with a cutoff of about 100 Hz.

Sources of on-board mechanical vibration

Gyroscopes, servos, etc.
Earth's central gravitational field
Elastic forces of tension and bending
Effects of initial conditions
Ellipticity of orbit
Earth oblateness effects
Solar and lunar gravity
Solar radiation pressure
Micrometeoroid impacts
Thermal bending

Table 2. Possible sources of platform vibration on a satellite.

MATHEMATICAL DESCRIPTION OF THE PAT SYSTEM

Functional  and  block  diagrams  are  used  as  graphical  aids  to 
demonstrate  hardware  configuration  and  information  flows.  Explicit 
mathematical  formulas  for  each  system  block  are  derived  from 
circuit analysis (Appendix C), ray optics, electro-optic material
analysis  and  manufacturer  specifications  depending  on  the  nature  of 
the  block. 

The  hardware  configuration  and  flow  of  information  for  the 
overall  system  is  shown  in  the  functional  diagram  in  Figure  3.  It 
features  three  major  feedback  loops,  one  in  the  fine  steering 
subsystem,  one  in  the  coarse  steering  subsystem,  and  one  formed  by 
the  WFOV  subsystem  and  the  computer  interface. 

Figure 4 shows the fine steering functional diagram, which
actually contains two "local" information feedback loops, one for
optical feedback and one for mirror position feedback. Note that
although both feedback loops are directly dependent upon the FSM
angular position, they do not use the same control law. The error
signal for the optical loop is derived from the laser's position on
the quadrant detector, while the error for the position loop is
obtained from the mirror's angular position.

The coarse steering functional diagram is shown in Figure 5
and contains four "local" information feedback loops. The four loops
bring information from the galvanometers, the fine steering
differential impedance transducers (DITs) interface, the optical
PSDs and the WFOV camera via the D/A converter in the
servo-electronics block. Each of the four loops brings information back
to the servo-electronics block for processing. The CSM assembly
contains its own electronics and internal feedback system, which is
not shown in this diagram. For more information on the CSM block
see Appendix B.

The WFOV functional diagram is shown in Figure 3. The
optical  information  obtained  by  the  WFOV  camera  is  generated  by  an 
image  of  the  beacon  LEDs  on  the  remote  stations  and  not  by  laser 
position  as  is  the  case  for  the  acquisition  and  tracking  detectors. 
This  information  is  directly  introduced  to  the  PC  via  the  GPIO 
interface  and  is  used  for  adjusting  the  CSM  position  either 
directly  or  by  adding  position  information  to  the  other  three 
feedback  signals.  In  either  mode  of  operation  the  WFOV  functional 
diagram  does  not  contain  "local"  feedback  loops. 

The fine steering block diagram is shown in Figure 6 and
operates in two modes controlled by switches SF1 and SF2. The
computer controls both switches using the enable track line from
the GPIO board. Fine tracking is enabled by switching SF1 and SF2
to position 1, thereby using the optical feedback information to
control the positioning of the fine steering mirror. When the
switches are in position 2, the DIT feedback information is used to
control the mirror position. The mathematical description of each
block in the fine steering block diagram follows.

Fine Steering Transfer Functions

Compensator:

\[ G_{F1}(s) = -\frac{12.0763\,(s + 60.7275)}{s + 54.9665} \]

\[ G_{F2}(s) = -\frac{263600\,(s + 3078.8)}{(s + 71808)(s + 11302)} \]

\[ G_{F3}(s) = -\frac{2308.1}{s + 2308.1} \]

Coordinate Rotation:

\[ G_{F4}(s) = -\frac{2.004 \times 10^{6}}{s + 2.004 \times 10^{6}} \]

Current Driver, Reaction Mass Actuator, Steering Mirror Actuator:

\[ G_{F5}(s) = \frac{17.0983 \times 10^{6}}{s^{2} + 672.8\,s + 17.0983 \times 10^{6}} \]

Note: s is the Laplace operator (symbol of differentiation).
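
As a worked example, the actuator block has the standard second-order form a0 / (s^2 + a1 s + a0), so its natural frequency is sqrt(a0) and its damping ratio is a1 / (2 sqrt(a0)). The Python sketch below evaluates both for the coefficients quoted above; the function names are hypothetical, and the PAT simulation program itself is written in Turbo Pascal.

```python
import math


def second_order_parameters(a1, a0):
    """Natural frequency (rad/s) and damping ratio of a0 / (s^2 + a1 s + a0)."""
    wn = math.sqrt(a0)
    zeta = a1 / (2.0 * wn)
    return wn, zeta


def gain(a1, a0, omega):
    """Magnitude of a0 / (s^2 + a1 s + a0) evaluated at s = j * omega."""
    s = complex(0.0, omega)
    return abs(a0 / (s * s + a1 * s + a0))


wn, zeta = second_order_parameters(672.8, 17.0983e6)
print(wn, wn / (2.0 * math.pi), zeta)   # rad/s, Hz, damping ratio
print(gain(672.8, 17.0983e6, 0.0))      # low-frequency gain
print(gain(672.8, 17.0983e6, wn))       # gain at the natural frequency
```

For these coefficients the natural frequency is about 4135 rad/s (roughly 658 Hz) and the damping ratio is about 0.08, so the block is lightly damped and its gain at the natural frequency is 1/(2 zeta), about 6.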

Fine Steering Static Relationships

Primary SIM AQC Main, RF1:

\[ X_{FO}(t) = \frac{(I_{QX1} + I_{QX2}) - (I_{QX3} + I_{QX4})}{I_{QX1} + I_{QX2} + I_{QX3} + I_{QX4}} \]

\[ Y_{FO}(t) = \frac{(I_{QY2} + I_{QY4}) - (I_{QY1} + I_{QY3})}{I_{QY1} + I_{QY2} + I_{QY3} + I_{QY4}} \]


XQX1  “  XQY1S 


Quadrant  Detector,  RF2: 

V  8  <«W-  ( I  Xq-Xbx!  1 2+ 1  I  ■ 2 )  >>)  , 

0 ,  Otherwise 

lo  /« 0W  <  I  l2+l  yq-ym2  I2)*1) , 

Otherwise 


lQX2 


_rxo/ 

CQY2~ 

L  o, 


if  XQ<XRX  and  yq>yrx 
if  Xq<Xrx  and  Yq<^rx 


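Geometry, Ray Optics, RF3:

\[ X_{Q}(t) = (d + \Delta d)\,[\sin(\alpha_{FXo} + 2\Delta\alpha_{FX}(t)) - \sin(\alpha_{FXo})] \]

\[ Y_{Q}(t) = (d + \Delta d)\,[\sin(\alpha_{FYo} + 2\Delta\alpha_{FY}(t)) - \sin(\alpha_{FYo})] \]

Kaman DITs Electronics, RF4:

\[ X_{FP} = X_{FPo}\,(|I_{DITX1}| - |I_{DITX2}|) \]

\[ Y_{FP} = Y_{FPo}\,(|I_{DITY1}| - |I_{DITY2}|) \]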
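
The ray-optics relationship has the form (d + Δd)[sin(α0 + 2Δα) - sin(α0)]: a mirror rotation of Δα turns the reflected beam through 2Δα. The Python sketch below evaluates it. The function name and the numerical values are hypothetical, and d is assumed to be the mirror-to-detector distance.

```python
import math


def spot_displacement(d, alpha0, delta_alpha, delta_d=0.0):
    """Beam displacement on the detector for a mirror rotation delta_alpha.

    d           : nominal mirror-to-detector distance (ASSUMED meaning of d)
    alpha0      : rest angle of the mirror axis, in radians
    delta_alpha : mirror rotation away from rest, in radians
    delta_d     : change in path length (the report's delta d)
    """
    return (d + delta_d) * (math.sin(alpha0 + 2.0 * delta_alpha) - math.sin(alpha0))


# Illustrative values only: a 1 mrad rotation with a 0.5 m lever arm.
print(spot_displacement(0.5, 0.0, 1.0e-3))
```

For small rotations about a zero rest angle the result is close to 2 d Δα, which is why a static gain is enough to relate mirror angle to spot position in the block diagram.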

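Differential Impedance Transducers, RF5:

\[ I_{DITX1} = K_{1}\,(I_{OX} + I_{MX}\sin(\alpha_{FX}(t)))\,\sin(\omega_{OSC} t) \]

\[ I_{DITX2} = -K_{1}\,(I_{OX} + I_{MX}\sin(\alpha_{FX}(t)))\,\sin(\omega_{OSC} t) \]

\[ I_{DITY1} = K_{1}\,(I_{OY} + I_{MY}\sin(\alpha_{FY}(t)))\,\sin(\omega_{OSC} t) \]

\[ I_{DITY2} = -K_{1}\,(I_{OY} + I_{MY}\sin(\alpha_{FY}(t)))\,\sin(\omega_{OSC} t) \]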


The coarse steering control block diagram of the CSM subsystem
is displayed in Figure 7. The computer operates switches SC1
through SC10, controlled through the GPIO board, to change the mode
of operation between pointing, coarse tracking, and fine tracking.
Pointing is enabled by opening all switches and setting SC5 and
SC10 to position 2, thereby using the WFOV feedback information from
the computer to control the positioning of the coarse steering
mirrors. Coarse tracking is enabled when switches SC5 and SC10 are
in position 1 and the LEC and Galvo feedback switches are closed.
Fine tracking is enabled when switches SC1, SC6, SC2, and SC7 are
closed, closing the DIT and Galvo information loops.

The  mathematical  representation  of  each  block  in  the  coarse 
steering  block  diagram  follows. 

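Coarse Steering Transfer Functions

Pre-filters:

\[ G_{C1}(s) = -\frac{6666.7}{s + 550.964} \]

\[ G_{C2}(s) = -\frac{200401}{s + 10^{6}} \]

\[ G_{C3}(s) = -\frac{10^{6}}{s + 10^{6}} \]

Coarse Steering Command Electronics:

\[ G_{C4}(s) = -\frac{10^{6}}{s + 10^{6}} \]

\[ G_{C5}(s) = -\frac{0.02315\,(s + 701.459)}{s + 325.489} \]

\[ G_{C6}(s) = -\frac{60.6061}{s + 60.6061} \]

CSM Control Electronics, Steering Mirror Actuator:

\[ G_{C7}(s) = \frac{756.072 \times 10^{3}}{s^{2} + 395.637\,s + 756.072 \times 10^{3}} \]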
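
To illustrate the dynamics of the last block, the sketch below integrates the unit step response of a0 / (s^2 + a1 s + a0) with a fixed-step fourth-order Runge-Kutta scheme, using the CSM actuator coefficients quoted above. It is a Python illustration with hypothetical names, and the step size is an assumption. The PAT simulation program is written in Turbo Pascal, and its integration method is not described here.

```python
def step_response(a1, a0, t_end, dt):
    """Unit step response of a0 / (s^2 + a1 s + a0), sampled every dt seconds.

    State: x1 = output, x2 = its derivative, so that
    x1' = x2 and x2' = a0 * (1 - x1) - a1 * x2 for a unit step input.
    """
    def deriv(x1, x2):
        return x2, a0 * (1.0 - x1) - a1 * x2

    x1, x2 = 0.0, 0.0
    out = []
    for _ in range(int(round(t_end / dt))):
        k1 = deriv(x1, x2)
        k2 = deriv(x1 + 0.5 * dt * k1[0], x2 + 0.5 * dt * k1[1])
        k3 = deriv(x1 + 0.5 * dt * k2[0], x2 + 0.5 * dt * k2[1])
        k4 = deriv(x1 + dt * k3[0], x2 + dt * k3[1])
        x1 += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        x2 += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        out.append(x1)
    return out


# ASSUMPTION: 10 microsecond step, 50 ms horizon; both chosen for illustration.
response = step_response(395.637, 756.072e3, t_end=0.05, dt=1.0e-5)
print(max(response), response[-1])
```

With a damping ratio of roughly 0.23, the response overshoots its final value of 1 by nearly half before it settles.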

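Coarse Steering Static Relationships

Acquisition Main Electronics, RC1:

\[ X_{CO}(t) = \frac{(I_{PSD3} + I_{PSD4}) - (I_{PSD1} + I_{PSD2})}{I_{PSD1} + I_{PSD2} + I_{PSD3} + I_{PSD4}} \]

\[ Y_{CO}(t) = \frac{(I_{PSD1} + I_{PSD3}) - (I_{PSD2} + I_{PSD4})}{I_{PSD1} + I_{PSD2} + I_{PSD3} + I_{PSD4}} \]

Position Sensitive Device, RC2:

\[ I_{PSD1} = \tfrac{1}{2} I_{0} \left[ \frac{L - X_{A}}{2L} + \frac{L + Y_{A}}{2L} \right] \]

\[ I_{PSD2} = \tfrac{1}{2} I_{0} \left[ \frac{L - X_{A}}{2L} + \frac{L - Y_{A}}{2L} \right] \]

\[ I_{PSD3} = \tfrac{1}{2} I_{0} \left[ \frac{L + X_{A}}{2L} + \frac{L + Y_{A}}{2L} \right] \]

\[ I_{PSD4} = \tfrac{1}{2} I_{0} \left[ \frac{L + X_{A}}{2L} + \frac{L - Y_{A}}{2L} \right] \]

Geometry, Ray Optics, RC3:

\[ X_{PSD}(t) = d\,[\sin(\alpha_{CXo} + 2\Delta\alpha_{CX}(t)) - \sin(\alpha_{CXo})] \]

\[ Y_{PSD}(t) = d\,[\sin(\alpha_{CYo} + 2\Delta\alpha_{CY}(t)) - \sin(\alpha_{CYo})] \]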
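
The first two relationships above can be checked numerically: generating the four PSD currents for a known spot position and applying the normalized sum-and-difference formulas should return that position scaled by 1/(2L). The Python sketch below does this. The function names are hypothetical, and the factor of one half in each current is an assumption about the garbled source.

```python
def psd_currents(x, y, half_len, i0=1.0):
    """Four electrode currents for a spot at (x, y) on a PSD of half length L.

    ASSUMPTION: each current carries a leading factor of 1/2, as read from
    the source; the factor cancels in the normalized outputs.
    """
    full = 2.0 * half_len
    i1 = 0.5 * i0 * ((half_len - x) / full + (half_len + y) / full)
    i2 = 0.5 * i0 * ((half_len - x) / full + (half_len - y) / full)
    i3 = 0.5 * i0 * ((half_len + x) / full + (half_len + y) / full)
    i4 = 0.5 * i0 * ((half_len + x) / full + (half_len - y) / full)
    return i1, i2, i3, i4


def normalized_position(i1, i2, i3, i4):
    """Normalized X and Y outputs formed from sums and differences of the currents."""
    total = i1 + i2 + i3 + i4
    x_co = ((i3 + i4) - (i1 + i2)) / total
    y_co = ((i1 + i3) - (i2 + i4)) / total
    return x_co, y_co


x_co, y_co = normalized_position(*psd_currents(1.0, -2.0, half_len=5.0))
print(x_co, y_co)
```

Because the outputs are divided by the total current, they depend only on the spot position and not on the beam power I0, which is the usual reason for normalizing.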

Current to Voltage Conversion, RC4:

\[ X_{CP} = X_{CPo}\,(|I_{CAPX1}| + |I_{CAPX2}|) \]

\[ Y_{CP} = Y_{CPo}\,(|I_{CAPY1}| + |I_{CAPY2}|) \]

Capacitive Rotation Sensors, RC5:

\[ I_{CAPX1} = K_{2}\,[I_{Xo} + I_{Xm}\sin(\alpha_{CX}(t))] \]

\[ I_{CAPX2} = K_{3}\,[I_{Xo} + I_{Xm}\sin(\alpha_{CX}(t))] \]

\[ I_{CAPY1} = K_{4}\,[I_{Yo} + I_{Ym}\sin(\alpha_{CY}(t))] \]

\[ I_{CAPY2} = K_{5}\,[I_{Yo} + I_{Ym}\sin(\alpha_{CY}(t))] \]

COMPUTER SIMULATION PROGRAM

The  "PAT  Simulation  Program"  which  implements  the  mathematical 
model  is  written  in  about  1300  lines  of  Turbo  Pascal  computer  code 
(Appendix D). This program simulates the dynamic responses of the
coarse  and  fine  steering  subsystems,  positions  of  the  acquisition 
and  tracking  lasers  on  the  PSD  and  the  quadrant  detector 
respectively,  and  the  effects  of  random  and  periodic  disturbance 
signals. 

The  PAT  simulation  program  allows  the  user  to  customize  the 
spectrum  of  the  disturbance  signal  to  simulate  a  large  variety  of 
satellite  configurations  and  environments.  White  noise  disturbance 
signals  are  contained  in  a  user  adjustable  low  frequency  envelope 
and  can  be  adjusted  in  magnitude  to  fit  any  typical  low-pass 
frequency spectrum. An adjustable "frequency spike" can be placed
at any frequency below 1 kHz to simulate a single-frequency cyclic
disturbance signal (Figure 8). After the user specifies the
disturbance  signal,  the  program  computes  and  displays  the 
disturbance  both  in  the  time-  and  frequency-domains.  This  provides 
an opportunity to verify the disturbance signal and change it if
necessary.  The  PAT  simulation  program  has  both  one-  and  two- 
dimensional  display  capabilities,  as  shown  in  Figures  9  through  13. 
Each  plotting  mode  features  the  dynamic  responses  of  the  coarse  and 
fine  steering  mirrors,  specifically  X  and  Y  positions  of  the  laser 
beams. However, in the first mode, X and Y positions are displayed
separately  as  feedback  signals  of  the  appropriate  control  loops, 
while  the  second  mode  displays  laser  position  which  represents  the 
feedback  control  signals  for  both  the  acquisition  and  tracking 
detectors.  The  magnitude  of  the  control  signals  is  directly 
proportional  to  the  initial  X  and  Y  coordinates  of  the  remote 
stations  with  respect  to  the  orientation  of  the  primary  station. 
The  two  flat  lines  represent  the  X  and  Y  control  signals'  final 
rest  position  after  the  target  has  been  acquired. 
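
The disturbance signal described above can be sketched as follows. This is a minimal illustration and not the Turbo Pascal code of Appendix D: the one-pole low-pass filter used for the envelope, the function name, and all parameter names are assumptions.

```python
import math
import random


def disturbance(n, dt, noise_rms, cutoff_hz, spike_hz, spike_amp, seed=0):
    """Low-pass-filtered white noise plus one sinusoidal 'frequency spike'."""
    if not 0.0 < spike_hz < 1000.0:
        raise ValueError("spike frequency must lie below 1 kHz")
    rng = random.Random(seed)
    # One-pole low-pass filter coefficient (assumed envelope shape).
    a = dt / (dt + 1.0 / (2.0 * math.pi * cutoff_hz))
    y = 0.0
    samples = []
    for k in range(n):
        white = rng.gauss(0.0, noise_rms)
        y += a * (white - y)  # shape the white noise with the envelope
        spike = spike_amp * math.sin(2.0 * math.pi * spike_hz * k * dt)
        samples.append(y + spike)
    return samples
```

Fixing the seed makes a run repeatable, which helps when the same disturbance has to be replayed against both steering subsystems.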

The two-dimensional simulation shows the position of the
communication  and  beacon  lasers  on  the  acquisition  and  tracking 
detectors,  as  shown  in  Figures  11  and  12.  This  simulation 
represents  a  static  translation  from  mirror  angular  position  to 
laser  position  on  the  detectors  and  is  directly  proportional  to  the 
optical feedback signals XCO, YCO, XFO, and YFO.

MODEL VERIFICATION

To test the accuracy of the defined mathematical model, a step
voltage signal was introduced into the reference input to the fine
steering subsystem. Figure 14 shows an oscilloscope photograph of
the step response obtained, as measured from the X position DITs
interface. Comparing this to the model response in Figure 13 shows
good correlation. The same test was performed on the coarse
steering subsystem with similar results.
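
A step test of this kind can be reproduced numerically for a single block of the model. The sketch below integrates only the second-order steering mirror actuator transfer function Gc7 in open loop, using its listed coefficients; it is not the closed-loop subsystem that was measured, and the integration scheme and step size are assumptions.

```python
import math

# Coefficients of Gc7 as listed among the coarse steering transfer functions.
WN2 = 756.072e3        # omega_n squared, (rad/s)^2
TWO_ZETA_WN = 395.637  # 2 * zeta * omega_n, rad/s


def step_response(t_end=0.1, dt=1e-5):
    """Integrate x'' + 2*zeta*wn*x' + wn^2*x = wn^2*u for a unit step u."""
    x, v = 0.0, 0.0
    peak = 0.0
    for _ in range(int(round(t_end / dt))):
        accel = WN2 * (1.0 - x) - TWO_ZETA_WN * v
        v += accel * dt   # semi-implicit Euler: update velocity first,
        x += v * dt       # then position with the new velocity
        peak = max(peak, x)
    return x, peak


final_value, peak_value = step_response()
zeta = TWO_ZETA_WN / (2.0 * math.sqrt(WN2))
```

The computed damping ratio is about 0.23, so this block taken in isolation is lightly damped and overshoots before settling at its unity DC gain.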
[Fine Steering Functional Diagram; labeled block: Quadrant Detector]

Figure 4. Fine steering functional block diagram.

[Coarse Steering Functional Diagram]
CONGESTION CONTROL FOR ATM NETWORKS IN
A TACTICAL THEATER ENVIRONMENT


Benjamin W. Hoe

Department of Electrical Engineering
Polytechnic University

Abstract

Rome Lab is currently developing a next-generation
experimental network known as the Secure Survivable Communications
Network (SSCN), based on broadband Asynchronous Transfer Mode (ATM)
technology. In ATM, all types of information (data, voice, video,
message) are placed in 53-byte packets (ATM cells) and
transmitted over available media. In the commercial arena, the
primary transmission medium for ATM is fiber, with rates beginning
at OC-3 (155.52 Mbps), using the STS-3c SONET protocol. Because of
its high transmission speed and switch architecture, congestion
control and queue management in ATM networks become an important
and complex research issue, especially when complicated by the
low-throughput, high bit-error transmission links encountered in the
military tactical environment. This issue is further complicated
in  military  conditions  where  traffic  patterns  are  dynamically 
changed  by  jamming,  degradation  of  communications  resources  and 
security  requirements.  There  is  much  literature  on  and  many 
techniques  for  congestion  control  in  "normal"  traffic  conditions; 
however,  congestion  control  in  dynamic  environments  (tactical 
theater  environment)  is  a  poorly  documented  area  in  ATM  networking. 
The congestion control and queue management techniques used in the
SSCN project should not only be capable of handling traffic at OC-3
rates (155.52 Mbps), but should also be able to evolve to support rates
in the OC-12 (622 Mbps) range and beyond in the future. First,
this paper presents an analysis of congestion control techniques to
be used in the SSCN program. Then, potential research issues
related to congestion control in a tactical environment are
discussed and future research issues are identified.

1. Introduction

Rome Laboratory is currently developing an experimental
wide-area communications network known as the Secure Survivable
Communications Network (SSCN) in order to meet the Tactical
Communications Land Combat Zone Post-2000 needs for integrated
information: voice, data, video, and message. The experience
gained from Desert Storm also suggests that future military
communications networks must be highly robust and capable of
allocating integrated services (voice, data, video) under stressed
conditions. The increasing use of high technology such as
target identification, pattern recognition using image
processing techniques, and large computer data file transfers
requires broader bandwidth than existing communications networks
can provide. The ATM concept provides adequate bandwidth, but
effective congestion control techniques under stressed
conditions are yet to be developed. Most queue
management and congestion control techniques are designed for
commercial applications, where the prediction of user traffic
patterns is relatively simple and well defined. In military
applications, the characteristics of a network are different: it
has a higher bit error rate due to enemy jamming, dynamically
changing traffic flow patterns, and diverse communications media.
These dynamic factors are categorized in Section 3 and are
used in the evaluation of congestion control techniques. That
evaluation is under way and is proposed to continue as
follow-up research under the Research Initiation Program (RIP)
sponsored by the Air Force Office of Scientific Research (AFOSR)
[4]. The congestion control technique used in the SSCN program
should not only be capable of managing traffic at OC-3 speed
(155.52 Mbps) but should also be able to evolve to OC-12 (622 Mbps)
and beyond in the future.


2. Congestion control techniques to be used in the SSCN testbed
and related research issues

In traditional queue management, output buffering has been
shown to give the best delay/throughput
performance [5]. GTE Corporation has implemented a technique
that uses both input and output buffers for congestion control
in the SSCN project [3]. Before an ATM cell is processed within
this congestion control mechanism, it is prioritized
using the contents of a GTE-defined node tag. The
node tag is 5 bytes (octets) long and is appended to an ATM cell
before it is injected into the ATM switch fabric module. It
contains 16 bits for internal routing information, 3 bits for
cell loss priority control, 3 bits for hardware priority level,
1 bit for the Audit Trail Cell flag, and bits for a parity check. The Audit
Trail Cell (ATC) flag is used to prevent corrupted routing
information from resulting in the misrouting of data. This flag
is set upon detection of a parity error. The congestion control
technique utilizing both input and output buffers is illustrated
in figure-1 and figure-2.

The mechanism is designed such that congestion occurs at the output
buffers first and is pushed back to the input buffers. The
input buffers start to queue up the traffic as the traffic in the
output buffers exceeds the predetermined threshold. Observe in
figure-1 that the traffic at the output queues exceeds the
threshold, so an input queue starts queuing up the incoming
traffic until the output queues are available for service again.
[Figure 1 labels: low priority cells are dropped when the input buffer is 50% full; cells with Audit Trail requested are rerouted to the local control port when the input buffer is 90% full]

Figure 1. Input buffer response when output buffer is congested


If  the  congestion  at  the  output  queues  continues  so  that  the 
input  buffer  continues  to  fill  up  and  exceeds  50%  of  its  queue 
length,  then  the  input  queue  will  start  dropping  cells  according 
to their priority level. First, low priority cells are dropped,
as shown in figure-1, and once the input queue is 90% full, the cells
with the audit trail bit set will be rerouted to the switch
control port. These audit trail bits are used for security
reasons, and a cell with the audit trail bit set cannot be
dropped until it is rerouted to the local control module and
accounted for [3].
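
The input buffer behavior described above can be summarized in a short sketch. It is an illustration of the thresholds in the text and not GTE's implementation: the names, the use of "at or above" for both thresholds, and the handling of cells the text does not mention are assumptions.

```python
from dataclasses import dataclass

DROP_THRESHOLD = 0.5      # low priority cells are dropped at or above 50% full
REROUTE_THRESHOLD = 0.9   # audit trail cells are rerouted at or above 90% full


@dataclass
class Cell:
    low_priority: bool
    audit_trail: bool


def input_buffer_action(cell, queued, capacity):
    """Return 'enqueue', 'drop', or 'reroute' for a cell arriving at an input buffer."""
    fill = queued / capacity
    if cell.audit_trail:
        # Audit trail cells are never dropped here; they go to the control port.
        return "reroute" if fill >= REROUTE_THRESHOLD else "enqueue"
    if cell.low_priority and fill >= DROP_THRESHOLD:
        return "drop"
    # Assumption: remaining cells are queued while space is left.
    return "enqueue" if queued < capacity else "drop"
```

For example, a low priority cell without the audit trail bit is queued at 40% occupancy and dropped at 50%, while a cell with the audit trail bit set is queued until the buffer reaches 90% and is then rerouted.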

This technique provides temporary congestion avoidance, but
more flexible use of the input queues can minimize unnecessary
congestion in the input circuit. Consider the following simple
example, where all output queues are full except
output queue 1. Because of the back pressure method, some input queues
start to queue up the traffic until the output queue is once
again available. As illustrated in figure-2, the Nth input
queue buffers the traffic because the first cell inside the
queue is destined for the Nth output queue, which is congested.
However, the second cell in the Nth input queue is destined for the 1st
output queue, which is available and in an idle state. But this
cell cannot go to output-1 until the cell ahead of it is
dequeued. One possible way to solve this problem is by
relaxing the FIFO concept and employing an appropriate queuing
algorithm at the input buffer.

[Figure 2 labels: input queue, switch, output queue]

Figure 2. Queue model with input and output buffering
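
One way to relax the FIFO discipline is to let the input buffer look a few cells past a blocked head-of-line cell. The sketch below is only one possible queuing algorithm, not the one used in the SSCN design; the function name, the look-ahead window, and the port numbers are hypothetical.

```python
from collections import deque


def dequeue_first_eligible(input_queue, output_free, window=4):
    """Remove and return the first cell, within a look-ahead window, whose output is free.

    input_queue: deque of destination output-port numbers, head at the left.
    output_free: dict mapping output port -> True if it can accept a cell.
    Returns the destination served, or None if every examined cell is blocked.
    """
    for index, dest in enumerate(input_queue):
        if index >= window:
            break
        if output_free.get(dest, False):
            del input_queue[index]
            return dest
    return None


# Hypothetical state: output N (here 8) is congested, output 1 is idle.
queue = deque([8, 1, 8])
served = dequeue_first_eligible(queue, {1: True, 8: False})
```

With `window=1` the function reduces to strict FIFO and the idle output stays unused. Cells bound for the same output are still served in arrival order, since a cell is only bypassed when its own destination is blocked.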

Another potential research area in queue management is the
development of a method to handle long burst traffic. High bit
rate traffic (long burst data) has a greater impact on other users than
low bit rate traffic. A queue model should manage
traffic so that high priority cells are served first,
low priority cells are discarded when the congestion level exceeds
the threshold, and cells of the same priority receive
equal treatment. Consider the following
case, where long burst traffic from user-A arrives at the queue
first and then low burst traffic from user-B arrives at the
queue. We assume that both user-A and user-B have the same
priority level. In this case, equal treatment for same priority
users is not possible, since the long burst traffic stream occupies
the queue and the server longer than the low burst traffic.
Consequently, the mean delay for low burst traffic user-B is
higher than that of high burst traffic user-A [1].


Figure  3.  The  effect  of  long  burst  traffic  on 
a  low  burst  traffic  user 
This problem can be solved by assigning a departure
sequence number based on a concept known as Virtual Clock. It
is expected that more long burst traffic will be generated by
users in wartime scenarios than in normal operation. The study
and analysis of bursty traffic and its impact on low data rate
users is critical, and the use of an appropriate queuing
algorithm for this issue can enhance the performance of
the congestion control mechanism under worst case scenarios.
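
The departure-sequence idea can be illustrated with a simplified Virtual Clock sketch. Each user has a virtual clock that advances by one cell time at the user's assigned rate on every arrival, and cells are served in order of their stamps. The class name, the rates, and the tie-breaking by arrival order are assumptions made for illustration.

```python
import heapq


class VirtualClockQueue:
    """Serve cells in order of per-user virtual clock stamps (simplified sketch)."""

    def __init__(self, rates):
        self.tick = {user: 1.0 / rate for user, rate in rates.items()}
        self.clock = {user: 0.0 for user in rates}
        self.heap = []
        self.seq = 0  # arrival counter, breaks ties between equal stamps

    def arrive(self, user, now):
        # Advance the user's virtual clock by one cell time at its assigned rate.
        stamp = max(now, self.clock[user]) + self.tick[user]
        self.clock[user] = stamp
        heapq.heappush(self.heap, (stamp, self.seq, user))
        self.seq += 1

    def serve(self):
        stamp, _, user = heapq.heappop(self.heap)
        return user


# Hypothetical case: user A sends a 5-cell burst, then user B sends 1 cell.
q = VirtualClockQueue({"A": 1.0, "B": 1.0})
for _ in range(5):
    q.arrive("A", now=0.0)
q.arrive("B", now=0.0)
order = [q.serve() for _ in range(6)]
```

Under plain FIFO the cell from user B would leave sixth; with the stamps it leaves second, because the later cells of the burst from user A carry later virtual times.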


3.  Research  Effort 

The research on congestion control mechanisms for tactical
networks was conducted jointly by the author and Rome Lab
personnel with technical information obtained from GTE
Government Systems. The objective of this research was to
identify the potential research issues associated with
congestion control techniques in tactical ATM networks, create
the simulation model, and perform analysis with different
dynamic parameters. As a result of this research, we have
identified some critical issues, which can be classified into the
following three major areas:

1)  Analysis  of  diverse  traffic  classes 

2)  Analysis  of  diverse  transmission  media 

3) Threat scenarios


3.1.  Analysis  of  diverse  traffic  classes 

The  diverse  traffic  types  must  be  identified  and  classified 
for  utilization  in  simulation  and  performance  analysis  of  ATM 
switches  in  a  military  environment.  The  analysis  of  diverse 
traffic  classes  and  their  impact  on  tactical  ATM  networks  is 
crucial since each type of traffic has different characteristics
in terms of bandwidth sensitivity and bit error sensitivity.
For example, video signals are delay sensitive: long
delays of video signals will result in unrecognizable images at
the receiver end. However, one bit error in video is acceptable
since it will not degrade the quality of the image
significantly. On the other hand, data is very sensitive to bit
errors; one bit error can result in a completely different
expression at the receiver end. The traffic arrival pattern is
another  critical  factor  in  the  tactical  environment.  The 
traffic  flow  in  peace  time  (normal  operational  mode)  is  similar 
to  the  traffic  patterns  in  commercial  networks  with  some 
additional  security  features.  However,  traffic  patterns  in  a 
hostile  environment  will  be  different.  In  hostile  operational 
modes,  the  traffic  loading  on  all  links  will  be  changing 
dynamically.  Some  links  or  channels  will  be  destroyed  or 
degraded  by  enemy  attacks  and  bottleneck  effects  are  expected  on 
remaining  links. 

In  order  to  thoroughly  understand  different  traffic  classes 
and  their  impact  on  congestion  control,  the  simulation  of 
different  traffic  patterns  with  different  behaviors  is  required, 
using the node model to be implemented in the SSCN testbed. The
study of different traffic patterns was initiated at Rome
Lab as part of the summer research effort, but more detailed
analysis and evaluation of diverse traffic classes remains to
be done because of time limitations and administrative delays
in technical information transfers from GTE Government Systems.

3.2. Analysis of diverse transmission media

The OPNET tool provides three generic link models that can
be used to characterize the type of connectivity that a
particular transmission medium utilizes: a point-to-point link,
a bus link, and a radio link. These three
types of link connectivity models will be analyzed in terms of
their respective parameters. For example, in a point-to-point
link, transmission delay, propagation delay, and bit error rate
will be defined and analyzed. Transmission delay and
propagation delay characterize the transmission speed, and the bit
error rate indicates how much a link is corrupted with errors.
It is known that higher bit error rates will be encountered in
tactical networks than in commercial networks. The error
threshold indicates the maximum error rate that is
allowed on a particular link. If a link or channel is
corrupted with noise so that its error rate exceeds the error
threshold, then that link should be taken out of
service; error detection and correction on cells arriving
from such a link would be difficult if the error rate is very
high. In bus links, packet collision and the closure factor will
also be considered, in addition to the delay and bit error
factors considered in point-to-point links. It is critical to
consider collision because packet collision in bus topologies is
unavoidable when users can transmit packets to the same
destination at the same time on the same medium. The closure
factor is also an important consideration, since it indicates
the eligibility of connectivity between transmitter and receiver
(e.g., for security reasons). In a radio link, more factors,
such as signal-to-noise ratio (SNR) and background noise
(thermal noise), must be included in the analysis. The focus of
this research was on point-to-point and bus type links, since
these two types will be used in the SSCN initial five-node
testbed, while radio links (which require detailed link model
definitions) should be added in the future when the models of
interest have been selected for development in the SSCN program.
Further study of transmission media and their impact on
congestion control is recommended to help characterize the
behavior of dynamic tactical networks.
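
The point-to-point link parameters discussed above can be made concrete with a few lines of arithmetic. This sketch does not use the OPNET interface; the function names, the assumed propagation speed, and the threshold values are hypothetical.

```python
CELL_BITS = 53 * 8  # one ATM cell is 53 bytes


def cell_transmission_delay(link_rate_bps):
    """Seconds needed to clock one ATM cell onto a link."""
    return CELL_BITS / link_rate_bps


def propagation_delay(distance_m, speed_mps=2.0e8):
    """Seconds for a signal to cross the link (speed is an assumed fiber value)."""
    return distance_m / speed_mps


def link_in_service(bit_errors, bits_observed, error_threshold):
    """Keep the link in service only while its measured BER stays within the threshold."""
    return (bit_errors / bits_observed) <= error_threshold
```

At the OC-3 line rate of 155.52 Mbps, one cell takes about 2.73 microseconds to transmit, so on links longer than a few hundred meters the propagation delay becomes the larger of the two terms.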
3.3. Threat scenarios


Military  threat  scenarios  have  been  documented  and 
implemented  in  a  network  management  system  prototype  testbed 
under  the  Communications  Network  Operating  System  II  (CNOS  II) 
project  conducted  for  Rome  Lab  by  Stanford  Telecommunications, 
Inc.  The  CNOS  II  program  developed  an  IMS  (Integrated 
communications  network  Management  System)  testbed  containing 
extensive  features  to  define  various  threat  parameters,  and 
model  the  response  of  the  IMS  to  various  user-defined  network 
models  and  dynamic  scenarios  of  interest.  The  functionality  of 
the  IMS  and  its  role  in  tactical  ATM  networks  is  discussed  in 
the  next  section.  As  part  of  the  research  effort,  threat 
scenarios  were  identified  as  one  of  the  critical  issues  in 
tactical  ATM  networks.  The  thorough  study  of  threat  scenario 
models  available  in  the  IMS  prototype  testbed,  and  the 
adaptation  of  those  models  to  support  the  SSCN  project  is 
recommended.  Any  congestion  control  mechanisms  to  be  used  in  a 
tactical  theater  environment  must  be  evaluated  under  different 
threat  scenarios  to  ensure  their  performance  under  worst  case 
conditions. The main component of the SSCN project is a
Multi-Level Secure (MLS) Tactical Switch (MTS) node, or ATM gateway
switch, the buffering mechanisms for which were designed using
the OPNET design tool of MIL3, Inc. Therefore, it is
recommended that threat scenarios be defined using OPNET to
evaluate the MTS node (ATM switch), which is already modelled in
OPNET. Our research revealed that no significant work has been done
on defining threat scenarios using the OPNET tool; hence, the
author and Rome Lab identify the modeling of threat scenarios in
OPNET as potential future research work.
4. The Integrated Communications Network Management System (IMS)
and its role in the SSCN


The  Integrated  Communications  Network  Management  System 
(IMS)  is  a  next-generation  network  management  system  which  was 
developed  under  the  Communications  Network  Operating  System  II 
(CNOS  II)  project.  The  CNOS  II  project  was  a  follow-on  to  the 
CNOS I project, which was initiated by Rome Lab to define the
high  level  architecture  for  a  highly  robust  and  survivable 
communications  system  that  will  integrate  multiple  networks  and 
transmission  media.  The  IMS  prototype  was  developed  under  the 
CNOS  II  project  to  evaluate  the  efficiency  and  effectiveness  of 
network  management  algorithms  in  different  dynamic  scenarios  of 
interest.  The  IMS  architecture  is  briefly  discussed  in  this 
section because of its crucial role in the SSCN project. As
illustrated  in  figure-4,  network  management  processing  functions 
were  partitioned  into  four  management  subsystems:  Service 
Manager, Administrative Manager, Facilities Manager, and Resource
Manager. Each manager provides certain functionality. The
Service  Manager  handles  user  interfaces,  access  control  and 
mapping  of  requested  services  to  transmission  resources.  The 
Resource  Manager  handles  transmission  system  interfaces,  and 
monitors  specific  transmission  links  and  controls  their 
operation,  including  connection  establishment.  The  Facilities 
Manager  is  responsible  for  monitoring  overall  performance  and 
access  control  to  existing  networks,  and  all  available  end-to- 
end  transmission  resources.  The  Administrative  Manager  manages 
non-real  time  functions  associated  with  user  access  priorities, 
privileges  and  security  level  as  well  as  configuration 
information [7]. In addition to these four network management
processing subsystems, there is an information processing
subsystem which contains the Management Information Base (MIB)
and Management Transfer Part (MTP) [7][8]. The network
management decision making algorithms reside in the management
processing subsystem. Each IMS node (domain) develops a nodal
viewpoint (local knowledge) of the network state and, through
coordination with other IMS nodes, attempts global user service
optimization. A three-node IMS prototype was developed under
the CNOS II project; it contains the aforementioned network
management functionality and operates on simulated traffic and
network resources. This prototype allows a user to define
various parameters associated with a tactical theater
environment, such as Mean Time Between Failure (MTBF), Mean Time
To Repair (MTTR), and Bit Error Rate (BER), and more specific
conditions, such as a particular link being disabled at certain
times due to enemy jamming and operational at other times.
Such threat situations can be defined in generic classes (link,
node, network, and equipment threats) or as more specific events, such as a
node or a link being taken out of service at a specific time.
These threat parameters can be used in the analysis of ATM
switch performance to evaluate how well it performs under
stressed conditions. The IMS plays an important role in the SSCN
project: it can not only evaluate the performance of
a network in dynamic conditions but also
control the tactical ATM switches in response to those adverse
conditions. The effectiveness of congestion control mechanisms
in an ATM network also depends on how effectively the IMS can
optimize the mapping of service requests onto available
transmission media. For example, the IMS must be able to
allocate resources to requested services so that congestion at
a single node (or a group of nodes) can be avoided.

Figure-4 Top level IMS architecture
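
Threat parameters of the kind listed above can be turned into simple quantities for a simulation. The sketch below is not the IMS testbed interface: it uses the standard steady-state availability ratio MTBF/(MTBF + MTTR) and a hypothetical list of outage windows to represent a link that is jammed at certain times.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state fraction of time a link or node is operational."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def link_is_up(t, outages):
    """Return False while t falls inside any (start, end) outage window."""
    return not any(start <= t < end for start, end in outages)


# Hypothetical scenario: a link jammed from t=100 s to t=160 s.
jamming = [(100.0, 160.0)]
```

A simulation loop could call `link_is_up` at each time step and divert or queue traffic for that link while it returns False.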


5. Discussion

The objective of this work is to identify the potential
research problems pertaining to congestion control in tactical ATM
networks. As a result of the research, we have identified some
critical issues, which were presented in previous sections.
A more detailed and thorough investigation of these issues against
potential queue management techniques, either proposed by GTE or
identified by Rome Lab and the author, is recommended. The
buffering mechanisms in the MTS ATM switch to be used in the SSCN
project were designed with the OPNET design tool. The OPNET
tool utilizes object oriented software and allows users to
model a network hierarchically. Network analysis can be
performed at one level without regard to the
higher or lower levels. Therefore, OPNET was
selected as our analysis tool in this research and for a follow-up
research effort. One major task still remaining to be
carried out is defining threat scenarios in OPNET simulation.
OPNET does not provide direct ways to define threat
scenarios, as the IMS testbed does, but this research revealed
that it is possible to define threat scenarios in terms of link
attributes, "C" code instructions in a State Machine, and
different node attributes and traffic patterns. The
detailed and complete construction of threat definitions in
OPNET is a time consuming process and is yet to be
conducted.


6. Conclusion

This research revealed that further work in congestion
control for tactical ATM networks is required. All congestion
control techniques to be used in the SSCN testbed must be
thoroughly evaluated in a dynamic environment to validate that
the current design will achieve the best
possible performance in a full scale deployment of the future
SSCN ATM switches. In fact, this research is an important
component of a much larger military initiative to implement a
more global (multiple government and commercial networks, multiple
vendors, etc.) ATM environment in which congestion control will
be much more complicated by the size of the networks, their
speed, and switch architectures.
Abstract


The Maximum Likelihood approach is described and applied to the imaging
of radar targets undergoing cooperative precessional motion under narrowband
millimeter wave radar signal illumination. A comparison with conventional
processing is made when simulated Gaussian noise is added to the measured data.
With the maximum likelihood algorithm, useful images are generated from
relatively small amounts of noisy data. The measured data used in the image
reconstructions were collected at the Rome Laboratory Prospect Hill Facility
in Waltham, MA.
MAXIMUM LIKELIHOOD BASED IMAGING
OF  PRECESSING  RADAR  TARGETS 


Kenneth  E.  Krause 


1  Introduction 

Radar  images  of  targets  are  formed  by  processing  the  echo  data  received  over 
an  appropriate  angular  aperture  and  frequency  bandwidth.  When  a  rotation 
about  the  axis  normal  to  the  radar  line-of-sight  is  used  in  conjunction  with  a 
wide radar bandwidth, conventional inverse synthetic aperture radar (ISAR)
images can be obtained [1]. Images can also be obtained from angular motion
in two directions, such as in the 'angle-angle' imaging described by Brown [2].

The  image  reconstructions  in  these  and  similar  situations  are  obtained  by 
performing  a  Fourier  transformation  on  the  received  data.  This  approach  is 
based on the theoretical relationship between the received signal and the
reflectance image when the target consists of independent scatterers.

The processing described above makes no provision for noise in the data.
This report describes imaging from a perspective that
accounts for additive noise in the measured radar data. Specifically, based on
the assumption that the target consists of independent scatterers, an imaging
approach based on maximum likelihood estimation theory is used to estimate
the reflectance image of a target when the measured data are contaminated by
additive Gaussian noise. Images so obtained are compared with images
obtained by conventional processing. For the processing used in this report, the
reasonable approximation is made that the scattering targets are planar.
The radar data measurements were obtained from the Rome Laboratory
Prospect Hill Facility located in Waltham, MA. The system there collects
imaging data using a stabilized, narrowband, 140 GHz transmitted signal. The
stabilized 140 GHz signal is obtained using a Gunn diode oscillator at 46.7 GHz followed by
a frequency tripler and a specially designed electrical and mechanical feedback
system. See [3] and [4] for details on the hardware.

Section  2  describes  a  conventional  processing  formulation  for  imaging  that 
was  developed  by  Rome  Laboratory.  Next,  Section  3  describes  the  maximum 
likelihood  approach  the  author  implemented  in  his  summer  work  assignment  at 
Rome  Laboratory.  Section  4  is  the  summary. 


2  Conventional  Processing 

The geometrical configuration for the radar measurements and the derivation of the
reconstruction algorithm are described in [5]. This section summarizes from
[5] the basic measurement equation and the notation to be used in this report.
Figure 1 shows the measurement configuration. The cooperative target motion
utilized for imaging is now described. The target is located in the xy-plane
with rectangular coordinates (x, y, 0) and polar coordinates (r, ψ, 0) given by
x = r sin ψ and y = -r cos ψ. We define the positive p-axis as lying in the
xy-plane oriented at angle θ measured from the negative y-axis. At each p-axis
angle θ, the target is tilted by angle α about the p-axis, and a coherent radar
measurement is made. The p-axis steps in θ for a fixed tilt angle α, as defined
above, for each radar measurement. The set of data collected as θ varies for a
fixed tilt angle α is called a ring of data.

Figure 1: Radar measurement for processing target.

Based on this measurement and geometry, the following equation is determined
for the image g_α(r, ψ) in terms of ideal measurements, G_α(θ).

\[ g_\alpha(r, \psi) = \int_0^{2\pi} G_\alpha(\theta)\, e^{-j 4\pi r \sin\alpha\, \sin(\psi - \theta)}\, d\theta \qquad (1) \]

The subscript α denotes that the image is due to the ring of ideal data G_α(θ).

The resolution obtained in this case is 0.4λ/sin α, the width of the central lobe
of  the  point  spread  function  between  zeros.  The  final  image  is  obtained  by 
adding  several  such  single-ring  images.  The  following  is  the  equation  for  the 
final  image  in  terms  of  single-ring  images. 

\[ g(r, \psi) = \sum_{\alpha} g_\alpha(r, \psi) \qquad (2) \]

The received data are given by the expression

\[ G_\alpha(\theta) = e^{\,j 4\pi r \sin\alpha\, \sin(\psi - \theta)} \qquad (3) \]

for an ideal point target of unit strength located at coordinates (r, ψ, 0).

Figure 2: Point Target - Conventional Processing: a. without noise, b. with noise.

Figure 2 provides simulation results that show how the image from an ideal
point target varies when noise is added to the data. The image reconstructed
from  data  corresponding  to  an  ideal  point  target  of  unit  strength  located  at 
coordinates  (15,  —5, 0)  is  shown  in  figure  2a.  Figure  2b  shows  the  reconstruction 
from  the  same  signal  data  as  the  previous  case,  but  with  Gaussian  noise  added. 
Simulated  point  target  data  had  unit  magnitude.  The  variance  of  the  simulated 
Gaussian noise added to the simulated data was 4.0. The xy-plane reference
location in the three-dimensional plots is z = 0. It is shown elevated for clarity
of viewing. The signal data used for this simulation corresponds to a tilt angle
α of 10°. Increasing non-target structure is seen as noise is introduced into the
signal  data. 
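
As a rough check on the single-ring imaging described above, the following Python sketch simulates a ring of data for a unit point target and forms the image by a discrete sum over the p-axis angles. It assumes a phase model in which a scatterer at (x, y), measured in wavelengths, returns exp(j 4π sin α (x cos θ + y sin θ)). That phase form, the function names, and the even split of the noise variance between real and imaginary parts are assumptions made for illustration and are not taken from the report.

```python
import cmath
import math
import random


def ring_data(x0, y0, alpha, n_angles, noise_var=0.0, seed=0):
    """Simulated ring of data for a unit point target at (x0, y0) in wavelengths.

    Assumed phase model: 4*pi*sin(alpha)*(x*cos(theta) + y*sin(theta)).
    noise_var is the total complex noise variance, split evenly between the
    real and imaginary parts (an assumption).
    """
    rng = random.Random(seed)
    k = 4.0 * math.pi * math.sin(alpha)
    sigma = math.sqrt(noise_var / 2.0)
    data = []
    for n in range(n_angles):
        theta = 2.0 * math.pi * n / n_angles
        sample = cmath.exp(1j * k * (x0 * math.cos(theta) + y0 * math.sin(theta)))
        sample += complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        data.append((theta, sample))
    return data


def image_value(data, x, y, alpha):
    """Single-ring image at (x, y): average of the data times the conjugate phase."""
    k = 4.0 * math.pi * math.sin(alpha)
    total = 0j
    for theta, sample in data:
        total += sample * cmath.exp(-1j * k * (x * math.cos(theta) + y * math.sin(theta)))
    return total / len(data)


alpha = math.radians(10.0)
clean = ring_data(15.0, -5.0, alpha, n_angles=256)
peak = abs(image_value(clean, 15.0, -5.0, alpha))

# Distance from the peak to the first zero of the Bessel function J0, in wavelengths.
first_zero = 2.404825557695773 / (4.0 * math.pi * math.sin(alpha))
null = abs(image_value(clean, 15.0 + first_zero, -5.0, alpha))

# Same target with Gaussian noise of variance 4.0 added to the data.
noisy = ring_data(15.0, -5.0, alpha, n_angles=256, noise_var=4.0, seed=1)
noisy_peak = abs(image_value(noisy, 15.0, -5.0, alpha))
```

Under this model the noiseless point spread function is a zeroth-order Bessel function, so twice `first_zero` is about 0.38/sin α wavelengths, which agrees with the roughly 0.4λ/sin α central-lobe width quoted earlier.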

Images obtained in [5] were formed by adding four rings of data with angles
α corresponding to equal increments of sin α and α_max = 10°. Images in [6]
were obtained using 16 rings of data.
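
The ring spacing can be illustrated with a short Python sketch. It assumes that the k-th of n rings has sin α_k = (k/n) sin α_max; the report states only that the increments of sin α are equal, so the starting point of the sequence and the function name are assumptions.

```python
import math


def ring_tilt_angles(n_rings, alpha_max_deg):
    """Tilt angles in degrees whose sines are equally spaced up to sin(alpha_max).

    Assumes ring k has sin(alpha_k) = (k / n_rings) * sin(alpha_max), k = 1..n_rings.
    """
    s_max = math.sin(math.radians(alpha_max_deg))
    return [math.degrees(math.asin(k * s_max / n_rings))
            for k in range(1, n_rings + 1)]


angles = ring_tilt_angles(4, 10.0)
```

Because the resolution of a single ring scales as 1/sin α, equal steps in sin α correspond to equally spaced ring radii in the spatial-frequency domain.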

The rest of this report will describe an approach to single ring image
reconstruction that explicitly accounts for the noise in the measured data.


3  Maximum  Likelihood  Processing 

Equation  (3)  will  be  modified  slightly  for  interpretation  in  terms  of  a  Maximum 
Likelihood  imaging  approach  the  author  has  formulated  as  part  of  his  doctoral 
research. 

The complex envelope representation for the return from a target of
reflectance strength B_m at location (x_m, y_m, 0) is given by

sma(6)  =  (4) 

where √E_t is a constant related to the transmitted signal energy of the
measurement. This representation of the reflectance by a constant parameter B_m is
valid over small variations in aspect angle [7]. The angle θ_m is a random variable,
uniformly distributed on [-π, π], representing the uncertainty in target location
on the scale of a wavelength of the illuminating radiation. As before, α is the
tilt angle. Scatterer phases will be denoted by lower case subscripts. The
variable θ refers to the p-axis angle. A specific measurement will be referenced
by upper case subscripts consisting of numerals or upper case letters.

The  received  data  are  then  described  by 

\[ r_\alpha(\theta) = s_{m\alpha}(\theta) + w(\theta), \qquad \theta = \theta_1, \theta_2, \ldots, \theta_N \qquad (5) \]

where w(θ) is complex white Gaussian noise of spectral height N_0. The author
has  proposed  a  specific  form  of  the  likelihood  function  to  use  in  an  estimation 
based  approach  to  imaging  targets  composed  of  independent  scatterers,  in  a 
noisy  measurement  environment.  In  the  present  context,  for  single  ring  images, 
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this  likelihood  function  is 
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describe  the  likelihood  in  terms  of  all  variables  defined  so  far.  Defining  a  scaled 
reflectance  B'm  by  the  expression 

\[ B'_m = \sqrt{2\pi}\, B_m \qquad (12) \]


the  likelihood  expression  (7)  becomes 

prj  p 

Afc.WlB'J  =  exp(-^i] 


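\[ \times \exp\!\left[-\sqrt{E_t}\, B'_m \left(L'_{cm\alpha}\cos\theta_m - L'_{sm\alpha}\sin\theta_m\right)\right], \qquad (13) \]

with

\[ L'_{cm\alpha} = \frac{L_{cm\alpha}}{\sqrt{2\pi}} \qquad (14) \]

and

\[ L'_{sm\alpha} = \frac{L_{sm\alpha}}{\sqrt{2\pi}}. \qquad (15) \]
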

Equation (13) is the form of the likelihood the author’s algorithm requires
in order to obtain the maximizing solution B'_m in terms of the variables
mentioned above. Once B'_m has been obtained, the reflectance B_m at coordinates
(x_m, y_m, 0) follows directly from equation (12). This is the maximum
likelihood solution. Specific features of the maximum likelihood algorithm
are a thresholding operation and solution of the nonlinear equation obtained by
setting the derivative of (13) equal to zero. These features cause either B'_m = 0
to be the solution, or else a nonzero solution for B'_m is obtained from the
nonlinear equation. In either case, this results in a conservative (low) estimate
of B_m compared to conventional processing. Figure 3 shows the maximum
likelihood algorithm applied to the same data used to reconstruct the conventional
images of figure 2. Notice that in the noiseless case, the maximum likelihood
estimate is the conventional estimate. (For computational purposes, a low noise
variance was used as input to drive the algorithm for the zero noise case
calculations.) With noise present, the algorithm prescribes the thresholding operation
and calculation of any nonzero reflectance estimates.

Figure 3: Point Target - ML Processing: a. without noise, b. with noise.

Two  cases  using  measured  data  will  now  be  presented.  The  measured  data 
from  Prospect  Hill  was  normalized  so  the  maximum  data  point  magnitude  would 
be  unity.  Noisy  data  was  simulated  by  adding  zero  mean  Gaussian  noise  of 
variance 0.1. For plotting purposes, where maximum likelihood estimates are
presented on contour plots, B_m = 0 solutions are set equal to the minimum
non-zero value, 0.0001.
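
The data preparation steps in this paragraph can be sketched in Python as follows. The helper names and sample values are hypothetical, and splitting the noise variance evenly between the real and imaginary parts is an assumption; the report states only the total variance.

```python
import math
import random


def normalize(data):
    """Scale complex samples so that the largest magnitude is 1."""
    peak = max(abs(z) for z in data)
    return [z / peak for z in data]


def add_noise(data, variance, seed=0):
    """Add zero-mean complex Gaussian noise of the given total variance.

    The variance is split evenly between real and imaginary parts (assumption).
    """
    rng = random.Random(seed)
    sigma = math.sqrt(variance / 2.0)
    return [z + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for z in data]


def floor_for_plot(estimates, floor=1e-4):
    """Replace zero estimates by a small positive floor for contour plotting."""
    return [floor if b == 0 else b for b in estimates]


measured = [3 + 4j, 1 - 1j, 0.5j]  # made-up samples, not measured data
normalized = normalize(measured)
noisy = add_noise(normalized, variance=0.1, seed=42)
plotted = floor_for_plot([0.0, 0.3, 0.0])
```

Flooring the zero estimates matters only for display: a logarithmic or contour scale cannot represent an exact zero, so the smallest plotted level stands in for it.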

One configuration studied consisted of three spheres with diameters in inches
of 0.4686, 0.2186, and 0.25, located at coordinates (0, 12.15, 0), (-4.45, -2.57, 0),
and (7.28, -4.21, 0), respectively. Coordinates given are in units of the
wavelength of the illuminating radiation, which was 2.14 mm.
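
The quoted wavelength follows from the 140 GHz transmit frequency. A minimal Python sketch, with hypothetical helper names, computes it and converts a coordinate given in wavelengths to millimetres:

```python
C = 299_792_458.0  # speed of light in m/s


def wavelength_mm(freq_hz):
    """Free-space wavelength in millimetres."""
    return C / freq_hz * 1e3


def to_mm(coords_in_wavelengths, lam_mm):
    """Convert coordinates expressed in wavelengths to millimetres."""
    return tuple(c * lam_mm for c in coords_in_wavelengths)


lam = wavelength_mm(140e9)
sphere_mm = to_mm((0.0, 12.15, 0.0), lam)
```

With these numbers the first sphere sits about 26 mm from the origin along the y-axis.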

Two realizations of noisy ring 16 images are given for this target in figures
4 and 5. Each figure contains the data computed directly from conventional
and maximum likelihood processing. It is emphasized, for assistance in
interpretation, that the contour plot scales are associated with different colors that
do not show up on this reproduction. Thus, although the conventional imaging
generally finds the same high level image points as the maximum likelihood, it
usually finds a few more higher level image points and many more lower level
image points. This just means that the image uncertainty implied by the
conventional processing is not as bad as the visual appearance implied by the black
and white display. Even with this consideration, the maximum likelihood
approach does leave fewer extraneous image points and consequently a computed
image having less apparent uncertainty.

Figure 4: Realization 1: Noisy Ring 16 Image of 3 spheres using a. Conventional,
b. Maximum Likelihood Processing.

In the sense that the major geometrical features of the sphere target can be
seen in some of the single ring images described above, the sphere target seems to
be well modeled by the independent point scatterer models which underlie both
the conventional and maximum likelihood processing. This is not necessarily
the case with all targets studied.

Figure 5: Realization 2: Noisy Ring 16 Image of 3 spheres using a. Conventional,
b. Maximum Likelihood Processing.

Consider the “RL” target. This is a thin aluminum single structure 83 mm
high by 63 mm wide. Its shape is derived from the form of the solid letters “R”
and “L”, with the “L” part to the right of, slightly lower than, and intersecting the
“R” part. A conventionally reconstructed and thresholded image, based on 8
rings of data having maximum tilt angle α of about 5°, is shown in figure 6a. This
thresholded image gives a pictorial idea of the target’s structure. Figure 6b
shows the image reconstructed from ring 8 of data. The two image scales are
different. Looking at the images, it can be seen that not all of the recognizable
features in the 8 ring reconstruction are clearly visible in the single ring
reconstruction. This is in contrast with the three sphere target. This difference
may be due to either level differences of point scatterers under the assumed
model or to more complicated scattering mechanisms requiring a different
model. In the latter case, a more detailed electromagnetic based model may
need to be incorporated into the imaging process. When noise was added to
single ring data, figure 7, the output of both algorithms needed to be thresholded
to produce an image resembling figure 6b.

Figure 6: “RL” Target - Conventional Processing, thresholded: a. Rings 1-8, b. Ring 8.


4  Summary 

A maximum likelihood based approach to producing images from noisy radar
data has been described, implemented on the Rome Laboratory VAX 750 computer,
and tested using measured data that was contaminated by simulated
Gaussian noise. The original data was measured at the Rome Laboratory
Prospect Hill Facility. Results show that with relatively small amounts of data,
useful images can be obtained.

Possible future activity could proceed along at least two lines. First,
remaining to be analyzed are some data sets from wire-like target structures and
from a dimpled “RL” target structure appearing to be of a diffuse nature. It is
recalled that the physical process modeled in both types of processing above
essentially considers targets as being composed of independent scatterers. It would
be valuable to extend the present study to these targets or perhaps others to
see how the model works. Another possibility, initially along more theoretical
lines, would be to extend the maximum likelihood approach to multiple rings of
data and/or incorporate more detailed scattering mechanisms into the imaging
process.

Figure 7: “RL” Target - Ring 8 Image, a. Conventional, b. ML Processing.
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and 

R.  David  Lankes 
Doctoral  Student 
School  of  Information  Studies 

Abstract

This report describes our efforts this summer to generate a method for translating our user-based
information system requirements into a representation form that would be readily
interpretable by system designers and system analysts. Typically, the requirements
specifications that we produce from our user requirements analyses are text-based descriptions of
problem solving processes as perceived by a group of users. In the past, these text-based
descriptions have proven to be difficult for system designers and analysts to interpret. Since our
long term research agenda is oriented towards large-scale information management systems,
which are more complex than traditional applications, we felt that we needed some systematic and
easily interpretable format that we could use for communicating our user requirements. Using a
combination of hypertext-like representations and a 3-dimensional virtual reality display, we have
been able to create a representation system that not only provides for effective interpretation of
our user requirements on the part of system designers and analysts, but also allows us to
represent system components (e.g., display devices, databases, network linkages, etc.) in the
same graphic environment. We believe that this combination of hypertext descriptions and
virtual reality graphic displays will facilitate the accurate representation of user perspectives
in large-scale information resource management systems.
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Ahsttafl 

Conducted electromagnetic (EM) interference measurements and analyses were performed on X-band
Transmit/Receive (T/R) modules built by Raytheon and Texas Instruments. The T/R module’s Clock, Mode¹,
+5 and -7 volt dc supply input lines and the Output Built-in-Test and Evaluation (OBITE) line were evaluated.
The Clock and Mode differential input pins are connected within the T/R module to a CMOS gate array through
DS8820/7820 differential line receivers. EM interference effects were simulated using PSPICE, and verified
through measurements, to determine whether the model of a DS8820/7820 provides accurate EM interference
simulation results at very high frequencies. The objective of performing measurements and simulations on the
DS8820 was to demonstrate that interference effects can be determined accurately on simpler devices and
models prior to developing more complex and costly products, such as T/R modules.

Limited simulations were also performed on the OBITE driver IC (54ALS03 NAND gate) and Power
Condition Monitoring (PCM) circuits that are connected to the +5 and -7 volt dc supply lines. The PCM circuits
are used to monitor over-voltage conditions on the +5 volt supply and over-temperature on the transmit power
amplifier, and to disable receive and transmit modes in the event of over-voltage or over-temperature
conditions. All interference effects, with the exception of receiver low noise amplifier (LNA) gain compression,
could be simulated. Effects that were duplicated during simulation included Mode words not received properly
by the T/R module, and the T/R module receiver LNA cycling off and on with the application and removal of
interference to the OBITE, +5 and -7 volt supply lines.

Damage effects that were observed while performing the interference measurements could not be
simulated. Two T/R modules and two DS7820 ICs were damaged over the course of this effort.


1 The Data lines were not tested on this effort. The Mode lines are used to send commands to the T/R
module to place the module in transmit, receive, etc., mode of operation, hence the nomenclature Mode. The
Data lines are used to place the T/R module in a particular state of operation once the mode of operation has
been selected. The default state of operation was selected, which is why the Data lines were not tested.
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Abstract 

In a collection of physically distinct computers, each one having its
own clock, it is often necessary to know how each clock is related to the
others and how that relationship changes over time. This relationship is
commonly composed of an initial offset and a drift rate. This summer’s
project calculated the drift rate between two Encore Multimaxes and two
local area network protocol analyzers. Some of the results are graphed and
give an approximate drift rate, but no strong conclusions were reached, nor
was a good drift rate found.
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1  Introduction 


In  a  collection  of  physically  distinct  computers,  each  one  having  its  own  clock, 
it  is  often  necessary  to  know  how  each  clock  is  related  to  the  others  and  how 
that  relationship  changes  over  time.  This  relationship  is  quantified  as  follows. 
At  some  initial  point,  two  clocks  differ  by  a  certain  number  of  seconds.  This 
initial  difference  is  known  as  the  offset.  Then,  after  t  seconds  have  elapsed, 
the  two  clocks  differ  by  an  amount  determined  by  a  linear  function  of  t.  This 
function  is  known  as  the  drift  rate.  These  two  components,  the  offset  and  the 
drift  rate,  are  used  to  quantify  the  relationship  between  clocks  on  physically 
distinct  computers. 
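
The offset and drift model can be written down in a few lines of Python. This is a minimal sketch that treats the drift rate as the slope of the linear function; the function names and numeric values are illustrative assumptions.

```python
def clock_difference(offset_s, drift_rate, elapsed_s):
    """Difference between two clocks after elapsed_s seconds (linear model)."""
    return offset_s + drift_rate * elapsed_s


def local_to_global(local_time_s, offset_s, drift_rate):
    """Map a local timestamp to the reference clock.

    Assumes local = global + offset + drift_rate * global.
    """
    return (local_time_s - offset_s) / (1.0 + drift_rate)


# Example: 0.5 s initial offset and a drift of 20 microseconds per second.
diff_after_hour = clock_difference(offset_s=0.5, drift_rate=20e-6, elapsed_s=3600.0)
```

Once the two parameters are known for a machine, any of its local timestamps can be converted to the reference clock with `local_to_global`.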

This summer’s experiment included calculating a drift rate between three
pairs of computers. In this experiment, there is one computer, denoted “mmax,”
that acts as the reference computer. The intent is to derive a drift rate for each of
three other machines relative to mmax. The three other machines are known as
machl, npac, and rome. These machines will be called the local machines while
mmax will be known as the global machine since the drift rates are calculated
with respect to a “global” time as seen by the mmax machine.

2  Network  Configuration 

For  this  experiment  a  simple  linear  network  is  used.  Starting  at  one  end  of  the 
network,  the  machine  machl  is  connected  to  npac  which  is  connected  to  rome 
which  is  connected  to  mmax.1 

3  Data  Collection 

To  calculate  the  three  drift  rates  for  the  machine  pairs  machl-mmax,  npac- 
mmax,  and  rome-mmax,  messages  were  sent  from  each  local  machine  to  the 
global  machine  (“mmax”)  every  hour  on  the  hour.  Moreover,  each  machine 
recorded  the  times  that  other  messages  passed  it  by.  For  example,  when  machl 
sent  a  message  M  to  mmax,  both  npac  and  rome  recorded  the  contents  of  M 
and  the  time  it  passed  by. 

Several experiments were conducted. Each experiment typically lasted 50
hours or more. During that time, each local machine sent messages to the
global machine every hour on the hour. Additionally, each machine recorded the
passing of messages. All the times in these recordings were written with respect to each
machine’s local clock.

After each experiment, there is one data file residing at each machine that
contains all the information that was logged during the experiment. The only
exception is the machl machine; it logs no data.
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4  Data  Alignment 

After  each  experiment  three  data  files  have  been  generated.  Each  entry  in  each 
data  file  corresponds  to  exactly  one  other  entry  in  each  of  the  other  two  data 
files.  At  this  point  all  the  data  files  for  all  the  experiments  (three  data  files  per 
experiment)  were  collected  into  a  central  repository. 

A problem with all the collected data was that there was no software to
access the data. The next task was to write software to access the data. This
software was implemented using C++.

After this software was written and all the data were fully accessible, it was
necessary to write more software to properly “align” the data in the three data
files in each experiment. Recall that each entry in a data file corresponded to
two other entries. Unfortunately, this correspondence is not concrete. In other
words, it is known that a correspondence exists, but it is not known exactly
which entries in the first data file match up to which entries in the second and
third data files.

Given this problem, more software was written to generate a single data
file with the corresponding data matched up from the three original data files.
The main algorithm implemented by this software is as follows. Every message
logged to a data file has a timestamp associated with it. Each timestamp is
rounded to the closest hour. Then messages from the three files are combined using
an algorithm similar to merge sort. The three data files are scanned looking for
messages whose timestamps match.
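
The alignment step can be sketched in Python as a three-way merge. This is an illustrative reading of the description above, not the original C++ program; the function names, the record layout (timestamp in seconds, message), and the sample logs are assumptions.

```python
def round_to_hour(ts):
    """Round a timestamp in seconds to the nearest whole hour."""
    return int((ts + 1800) // 3600) * 3600


def align(log_a, log_b, log_c):
    """Three-way merge of logs sorted by time.

    Each log is a list of (timestamp_seconds, message). Returns one row per
    hour for which all three logs contain an entry.
    """
    rows = []
    i = j = k = 0
    while i < len(log_a) and j < len(log_b) and k < len(log_c):
        ha = round_to_hour(log_a[i][0])
        hb = round_to_hour(log_b[j][0])
        hc = round_to_hour(log_c[k][0])
        if ha == hb == hc:
            rows.append((ha, log_a[i], log_b[j], log_c[k]))
            i, j, k = i + 1, j + 1, k + 1
        else:
            # Advance every log that is behind the latest hour seen.
            latest = max(ha, hb, hc)
            if ha < latest:
                i += 1
            if hb < latest:
                j += 1
            if hc < latest:
                k += 1
    return rows


log_a = [(3601.2, "m1"), (7199.0, "m2"), (10802.0, "m3")]
log_b = [(3599.5, "m1"), (10800.4, "m3")]  # the 2-hour entry is missing
log_c = [(3600.9, "m1"), (7200.3, "m2"), (10801.1, "m3")]
aligned = align(log_a, log_b, log_c)
```

Rounding to the hour works here because messages are sent hourly and the clock differences are small compared with half an hour; entries without a partner in all three logs are dropped.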

Finally, the three data files generated from each experiment were condensed
to a single, coherent data file. It was from this data file that the real calculations
could be performed.

5  Calculations 

For each experiment several calculations were made to obtain the drift rates
for npac-mmax and rome-mmax. The graphs at the end of this report
plot the calculated drift rates for various experiments on several dates. Graphs
entitled “dY Calculation” graph drift rates for rome-mmax, and graphs entitled
“dX Calculation” graph drift rates for npac-mmax.

Note  that  most  of  the  calculated  drift  rates  hover  about  20  microseconds  per 
second. 
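
The report does not give the formula used for the drift rate. As one possible approach, and not necessarily the one used here, the Python sketch below fits a least-squares line to the clock difference as a function of reference time; the function name and the synthetic data are assumptions.

```python
def fit_offset_and_drift(global_times, local_times):
    """Least-squares line through (t, local - global).

    Returns (offset, drift_rate), where drift_rate is in seconds per second.
    """
    n = len(global_times)
    diffs = [l - g for g, l in zip(global_times, local_times)]
    mean_t = sum(global_times) / n
    mean_d = sum(diffs) / n
    sxx = sum((t - mean_t) ** 2 for t in global_times)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(global_times, diffs))
    drift = sxy / sxx
    offset = mean_d - drift * mean_t
    return offset, drift


# Synthetic hourly samples over 50 hours with a 20 microsecond-per-second drift.
hours = [3600.0 * h for h in range(51)]
local = [t + 0.25 + 20e-6 * t for t in hours]
offset, drift = fit_offset_and_drift(hours, local)
```

For scale, a drift of 20 microseconds per second amounts to 72 ms per hour, or 3.6 s over a 50-hour experiment.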


6  Conclusion 

Contrary to the assumption that drift rates would be constant, the graphs
show differing drift rates. The drift rates range from 20 microseconds per
second to 22 microseconds per second. On one graph the drift rate is actually
changing with time.
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In  conclusion,  the  resultant  drift  rates  were  not  what,  we  expected  and  of 
little use. Some of the results are graphed and give an approximate drift rate,
but no strong conclusions were reached, nor was a good drift rate found.
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Feb2627,  dY  Calculation,  all  steps 

[Graph: Y axis scaled by 10^-6, X axis; curves for steps 2, 4, 8, 16, and 32.]

Feb0506, dX Calculation, all steps
[Graph: Y axis scaled by 10^-6, X axis; curves for steps 1, 2, 4, 8, 16, and 32.]

[Graph: X axis; curves for steps 1, 2, 4, 8, 16, and 32.]

Mar3031, dX Calculation, all steps
[Graph: Y axis scaled by 10^-6, X axis; curves for steps 1, 2, 4, 8, 16, and 32.]

Feb2627 (reduced network effects), dX Calculation, all steps
[Graph: Y axis scaled by 10^-6.]


CENTRAL  ISSUES  IN  PERFORMANCE  EVALUATION  OF 
HETEROGENEOUS  DISTRIBUTED  COMPUTING  SYSTEMS  WITH  C3 

APPLICATIONS 


WALEED  W.  SMARI 

Department  of  Electrical  &  Computer  Engineering 


Syracuse  University 
121  Link  Hall 

Syracuse,  NY  13244-1240 


Final  Report  for: 

AFOSR  Summer  Research  Program 
Rome  Laboratories 


Sponsored  by: 

Air  Force  Office  of  Scientific  Research 
Bolling  Air  Force  Base,  Washington,  D.  C. 


September  1992 



ABSTRACT

The  principal  objective  of  this  research  is  to  contribute  to  the  efficient 
use  of  distributed  computing  systems  by  identifying  some  of  the  major 
issues  that  affect  the  performance  of  these  systems.  These  major  issues 
are  discussed  under  six  categories,  namely:  system  architecture  and 
environment,  workload  specification  and  characterization,  constraints, 
service  disciplines,  performance  measures,  and  finally  optimality  criteria 
and  strategies.  Additionally,  this  work  is  aimed  at  specifying  and 
surveying  some  key  problems  and  their  solutions  which  we  consider 
crucial  to  improving  the  performance  -  problems  such  as  control, 
partitioning  and  mapping,  scheduling,  synchronization,  memory  access, 
among  others. 

These  issues  and  problems  may  serve  as  a  crude  classification  scheme 
for  comparison  of  distributed  computing  systems. 
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CENTRAL  ISSUES  IN  PERFORMANCE  EVALUATION  OF  HETEROGENEOUS 
DISTRIBUTED  COMPUTING  SYSTEMS  WITH  C3  APPLICATIONS 


WALEED W. SMARI


I. INTRODUCTION

A  distributed  computing  system  is  a  collection  of  processor-memory  (hardware)  pairs  connected 
by  a  communications  network  and  logically  integrated  in  varying  degrees  by  a  distributed  operating 
system  and/or  distributed  database  system.  The  communications  network  may  be  a  wide 
(geographically  dispersed)  or  a  local  area  network. 

The  widespread  use  of  distributed  computing  in  command,  control,  and  communication  (C3) 
systems  is  due  to  many  factors.  Users  are  located  at  different  terminals  with  a  desire  to  communicate 
and  to  share  information  and  resources.  Information  generated  at  one  place  is  often  needed  at 
another. A Distributed Computing System (DCS) potentially provides significant advantages, some
of which are performance enhancement, reliability improvements, availability, resource sharing (both
hardware and software resources), scalability, and expandability. This gives the system the ability to
easily adapt to short-term (varying workloads, failures, network traffic, etc.) as well as long-term
(major modifications) changes without significant disruption of the system. Distributed computing
systems can provide the necessary power to meet the demand of the user community, which
is growing faster than the advances in devices alone can supply. In the past, a major deterrent to the
distributed approach has been cost. However, the cost of hardware is generally going down, and
cost-effective and efficient computer/communications systems are feasible.

These  distributed  computing  systems  (along  with  large  distributed  databases,  distributed 
processing  and  distributed  communication  networks)  have  given  rise  to  some  very  complex  structures 
with  very  complex  problems.  Let  us  consider,  for  instance,  the  issue  of  computation  speedup.  The 
methods  for  speeding  up  the  computation  may  include  the  following:  1)  faster  chips  -  a  physics  and 
engineering  problem,  2)  architectures  that  permit  concurrent  processing  -  a  system  design  problem,  3) 
compilers  for  detecting  concurrency  -  a  software  engineering  problem,  4)  algorithms  for  specification 
of  concurrency  -  a  language  problem,  and  5)  models  of  computation  -  an  analytic  problem. 

Various distributed computing systems have been proposed, and many have been built. Although
technology facilitates the construction of these systems, the way to reach their full potential still
needs  more  exploration.  It  is  clear  from  the  above  discussion  that  we  need  to  learn  how  to  think 
about  these  systems  properly  and  how  to  assess  and  evaluate  their  performance. 
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n.  FUNDAMENTAL  ISSUES  AND  PROBLEMS 

In  this  section  we  introduce  the  primary  issues  that  determine  or  influence  the  performance  of  any 
distributed  computing  system.  We  also  discuss  key  problems  and  their  proposed  or  implemented 
solutions  -  problems  such  as  control,  partitioning,  scheduling,  synchronization,  memory  access,  etc. 

A.  System  Architecture  and  Environment  Issues 

This  may  include: 

1.  The  Number  of  Machines  Available: 

The  simplest  case  we  can  start  with  is  that  of  a  single  machine  system.  For  the  more  practical  case 
of  multiple  machine  systems,  we  classify  them  as  either  parallel  machines  or  shop  systems.  Parallel 
machine  systems  assume  the  situation  where  each  job  needs  only  a  single  operation  for  its  completion 
while  the  machines  operate  in  parallel.  If  each  job  requires  multiple  operations  and  the  machines  are 
used  in  series,  then  we  refer  to  these  systems  as  shops.  In  this  case,  each  job  needs  execution  on 
more  than  one  machine.  Three  common  categories  of  shop  problems  can  be  distinguished: 

i. Open Shop Problems: where each job J_j consists of a set of m operations {O_1j, O_2j, ..., O_mj}.
    The order in which a job passes through the machines (or the operations are executed)
    is immaterial.

ii. Flow Shop Problems: where each job J_j consists of a chain of m operations (O_1j, ..., O_mj).
    Each job in the set of jobs has the same machine ordering.

iii. Job Shop Problems: where each job J_j consists of a chain of m operations (O_1j, ..., O_mj).
    Each operation has to be processed on a given set of machines. That is, a particular
    machine ordering is specified for each individual job.
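
The three shop categories can be told apart by looking at the machine orderings of the jobs. The Python sketch below is a simplified illustration with hypothetical names; it assumes each job is described by the sequence of machines its operations use, plus a flag saying whether that order is binding.

```python
def classify_shop(routes, order_matters=True):
    """Classify a multi-operation scheduling instance.

    routes maps each job name to the sequence of machines its operations use.
    If order_matters is False, the operations may run in any order (open shop).
    """
    if not order_matters:
        return "open shop"
    orderings = {tuple(route) for route in routes.values()}
    # One shared ordering means a flow shop; job-specific orderings mean a job shop.
    return "flow shop" if len(orderings) == 1 else "job shop"


flow = {"J1": ["M1", "M2", "M3"], "J2": ["M1", "M2", "M3"]}
job = {"J1": ["M1", "M2", "M3"], "J2": ["M3", "M1", "M2"]}
kinds = (classify_shop(flow), classify_shop(job), classify_shop(job, order_matters=False))
```

Seen this way, a flow shop is the special case of a job shop in which every job happens to have the same route.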

In  the  more  general  case  of  distributed  systems,  regardless  of  the  point  of  view  (whether 
hardware  or  software),  a  system  is  actually  a  combination  of  both  parallel  processors  and  shops. 

2.  The  Types  of  Machines  in  the  System: 

Machine  types  may  be  classified  as: 

i. Identical Machines: where all machines have the same speed and capability in processing
    any job.

ii. Uniform (Heterogeneous) Machines: where each machine has a specific speed of processing
    regardless of job types. Yet, each machine executes all jobs at the same speed.

iii. Unrelated Machines: where no particular relationship between the processing speeds of
    different machines (with different jobs) exists. That is, processing speeds are arbitrary.
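
In the usual scheduling formulation, the three machine types differ only in how the processing time of a job on a machine is computed. The Python sketch below illustrates this with hypothetical job sizes, speeds, and function names.

```python
def time_identical(work, job):
    """Identical machines: the time depends only on the job."""
    return work[job]


def time_uniform(work, speed, job, machine):
    """Uniform machines: each machine has one speed that applies to every job."""
    return work[job] / speed[machine]


def time_unrelated(table, job, machine):
    """Unrelated machines: an arbitrary time for every (job, machine) pair."""
    return table[(job, machine)]


work = {"J1": 12.0, "J2": 6.0}   # hypothetical work units per job
speed = {"M1": 1.0, "M2": 2.0}   # hypothetical machine speeds
table = {("J1", "M1"): 12.0, ("J1", "M2"): 3.0,
         ("J2", "M1"): 2.0, ("J2", "M2"): 9.0}

t_identical = time_identical(work, "J2")
t_uniform = time_uniform(work, speed, "J1", "M2")
t_unrelated = time_unrelated(table, "J2", "M1")
```

In the unrelated table, M2 is the faster machine for J1 but the slower one for J2, which cannot happen with identical or uniform machines.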

3.  Architecture  Configuration: 

The  architecture  as  well  as  the  organization  of  the  system  greatly  influence  the  kind  of 
performance modeling and analysis used. Therefore, it is important to define the details of the
system's  structure  and  components  such  as  memories,  processing  units,  control,  interconnection 
network,  and  so  on. 

Many  computer  architectures  and  organizations  have  been  proposed  and/or  implemented.  These 
may be distributed or centralized systems, pipelined systems, array systems, multiprocessor systems,
multicomputer  systems,  adaptable  architectures  such  as  reconfigurable  systems  or  dynamic  computer 
systems,  data-flow  architectures,  etc.  Many  classifications  and  taxonomies  of  computing  systems 
architectures  have  been  published  based  on  the  way  these  systems  are  viewed.  The  most  celebrated 
ones  were  proposed  by  Flynn  [72],  Anderson  and  Jensen  [75],  Enslow  [77],  Jones  and  Schwarz 
[80],  and  Duncan  [90]. 

As for components, let us consider, for example, the issue of memories. Depending on the organization of the memories, multiple-processor systems can be classified as centralized-, distributed-, and mixed-memory multiprocessors. In a centralized-memory multiprocessor, all memory modules are equally accessible to all processors. A processor-memory interconnection network is needed to allow all processors in the system to access the memory modules. Typical examples are Encore's Multimax and the CRAY X-MP. Several hardware bottlenecks exist in such a system. These are functions of the number of processors, the memory bandwidth, and the bandwidth of the interconnection network used. In a distributed-memory multiprocessor, each memory module is physically associated with a processor. No memory is globally accessible. An interconnection network is needed to allow the processors to communicate with each other. Because each memory module is attached to a corresponding individual processor, the performance of an algorithm depends on how well the application problem is partitioned and mapped onto the processors. The main overhead in such a system is the interprocessor communications. A multiprocessor system may have a mixed-memory structure to provide a local memory to each processor and global memory modules shared by all processors. An example of such a system is the Carnegie-Mellon Cm* multiprocessor (Gehringer et al [87]).

B.  Workload  Specification  &  Characterization 

By  workload  we  roughly  mean  the  set  of  all  inputs  (programs,  data,  commands)  that  a  computing 
facility  receives  from  its  users.  In  any  evaluation  study,  we  have  to  decide  under  what  workload  the 
performance  of  the  system  should  be  evaluated.  Performance  measures  are  meaningful  only  if  the 
workload  by  which  their  values  are  produced  is  precisely  specified. 

Two  workload-related  issues  need  to  be  considered  during  the  planning  phase  of  an  evaluation 
study.  One  is  the  specification  of  the  type  of  workload  which  is  to  drive  the  system  (or  a  model  of  the 
system).  The  second  is  the  characterization  of  the  workload  for  the  system  under  study. 
Characterizing  a  workload  for  evaluation  purposes  requires  determining  which  of  its  numerous 
aspects have an influence on the system's performance. An actual workload can be characterized with
various  degrees  of  detail  using  a  workload  model.  The  workload  model  is  to  be  formulated, 
constructed,  tested,  calibrated,  and  validated.  In  a  way,  this  model  can  be  viewed  as  a  set  of 
quantifiable  parameters. 

Let  us  discuss  next  some  of  the  parameters  that  may  be  used  to  characterize  a  workload. 

1.  Job  Description: 

Any  actual  workload  can  be  regarded  as  consisting  of  a  set  of  jobs,  each  one  of  which  performs 
an  information-processing  task  when  it  is  processed  by  the  system.  Each  job  that  is  to  be  executed  on 
a  given  system  is  characterized  by  certain  parameters.  These  parameters  may  be  deterministic  or 
stochastic.  They  may  include: 

*  Number  of  operations; 

*  One or more processing times, pij, that the job Jj has to spend on each machine Mi on which it requires processing;

*  A due date or deadline, dj > 0, by which the job Jj should ideally be completed;

*  A release (or ready or arrival) date, rj ≥ 0, on which the job Jj becomes available for processing;

*  A weight, wj > 0, indicating the relative importance of the job Jj;

*  A completion time, Cj > 0, by which the job Jj is actually completed;

*  A nondecreasing real cost function to measure the cost incurred if the job is completed at time t;

*  An allowance time, aj, which is the period allowed for processing between the ready time and the due date: aj = dj - rj.


Fig.  1  Time  Quantities  of  a  Job 
Various other terms associated with a job Jj may be defined as follows:

Flowtime:  Fj = Cj - rj

Waiting time:  Wj = Fj - pij (summed over the machines on which Jj is processed when it has several operations)

Lateness:  Lj = Cj - dj = Fj - aj = Cj - rj - aj

Earliness:  Ej = max{0, dj - Cj} = max{0, -Lj}

Tardiness:  Tj = max{0, Cj - dj} = max{0, Lj}

Tardiness indicator:  1 if Tj > 0;  0 if Tj = 0
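The time quantities of a job can be computed directly from its release date, due date, processing time, and completion time. The sketch below assumes a single total processing time per job; the `Job` class and the field names are hypothetical and serve only to illustrate the definitions.

```python
from dataclasses import dataclass


@dataclass
class Job:
    release: float     # r_j
    due: float         # d_j
    processing: float  # total processing time of the job
    completion: float  # C_j


def time_quantities(job):
    """Return the derived time quantities of one job as a dictionary."""
    flow = job.completion - job.release    # F_j = C_j - r_j
    allowance = job.due - job.release      # a_j = d_j - r_j
    lateness = job.completion - job.due    # L_j = C_j - d_j
    return {
        "flowtime": flow,
        "waiting": flow - job.processing,  # W_j = F_j - processing time
        "allowance": allowance,
        "lateness": lateness,
        "earliness": max(0, -lateness),    # E_j
        "tardiness": max(0, lateness),     # T_j
        "tardy": 1 if lateness > 0 else 0, # tardiness indicator
    }


if __name__ == "__main__":
    print(time_quantities(Job(release=2, due=10, processing=5, completion=12)))
```

For the sample job the lateness is 2, which also equals the flowtime (10) minus the allowance (8), in agreement with Lj = Fj - aj.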


2.  Job  Behavior: 

As mentioned before, the nature and structure of the workload greatly affect the performance of the computing system. For example, in some computing facilities, such as hard real-time systems, there are two types of tasks: nonperiodic and periodic. A nonperiodic task has an arbitrary arrival time. A periodic task with period P requires that one instance of the task be executed once every P units of time after system initialization. The deadline for each request of a task
can  be  no  later  than  the  initiation  of  the  next  request  of  the  same  task,  i.e.  no  later  than  P.  Such  is  the 
case  in  a  satellite  tracking  system  where,  in  order  to  have  the  antenna  continuously  aimed  at  a  satellite, 
the  system  must  process  periodic  requests  for  adjusting  the  aiming  of  the  antenna.  Each  request  for 
adjustment  must  be  completely  processed  by  the  tracking  system  within  a  limited  time  with  respect  to 
the  previous  adjustment  of  the  antenna.  In  this  case,  the  tasks  are  periodic  and  with  hard-real-time 
requirements. 

Jobs  may  be  in  various  states  of  processing  at  any  given  time.  The  following  terms  account  for 
this: 

Nw(t)  =  the  number  of  jobs  waiting  or  not  ready  for  processing  at  time  t 
Np(t)  =  the  number  of  jobs  actually  being  processed  at  time  t 
Nc(t)  =  the  number  of  jobs  completed  by  time  t. 

Nu(t)  =  the number of jobs still to be completed at time t
Therefore:  Nu(0) = n,  Nu(Cmax) = 0,  and at any time t,

Nw(t) + Np(t) + Nc(t) = n
Nw(t) + Np(t) = Nu(t)

Generally  speaking,  when  using  a  distributed  computing  system,  the  workload  of  the  system 
differs  in  a  variety  of  ways  from  the  centralized  system.  However,  very  little  data  is  available  to 
conduct  a  quantitative  analysis  of  the  workload  of  these  systems. 




C.  Classes of Constraints

In  any  computing  system,  one  or  more  constraints  may  be  present.  The  two  principal  categories 
are:  constraints  in  the  system  and  constraints  in  the  tasks  to  be  executed.  These  two  may  additionally 
be  classified  as  constraints  in  time,  priority,  precedence,  resources,  and  interprocessor 
communication. 

1.  Time  Constraints: 

A  time  constraint  affects  the  execution  timing  of  an  operation.  A  typical  such  constraint  may  be  a 
due  date  which  imposes  restrictions  on  the  latest  allowable  completion  time.  Another  example  is  a 
release  time  constraint  which  specifies  the  earliest  starting  time  for  processing  of  the  job. 

Timing  constraints  are  especially  important  in  a  real-time  system.  Such  a  system  violates  a  timing 
constraint  if  any  job  misses  a  deadline.  Depending  on  how  much  it  tolerates  violations  of  timing 
constraints,  a  real-time  system  can  be  classified  as:  "hard"  or  "soft".  A  hard  real-time  system 
cannot  tolerate  a  single  violation  of  a  timing  constraint.  A  task  is  considered  to  be  of  value  only  if  it 
finishes  before  its  deadline.  In  contrast,  a  soft  real-time  system  tolerates  timing  violations  and 
allows  jobs  to  be  prioritized  according  to  their  contribution  to  the  continuing  function  of  the  system 
while  minimizing  the  total  number  of  deadline  misses.  Performance,  in  this  case,  is  characterized  by 
the  extent  to  which  jobs  miss  deadlines.  In  effect,  jobs  have  to  be  executed  as  quickly  as  possible, 
but  there  is  no  explicit  timing  constraint  associated  with  them. 

2.  Priorities: 

In  a  priority-driven  environment,  the  active  task  with  the  highest  priority  will  be  processed  first. 
An  active  task  is  a  task  that  has  requested  processing  but  has  not  yet  received  an  amount  of  processor 
time  equal  to  its  run-time.  A  priority-driven  algorithm  is  characterized  by  the  manner  it  assigns 
priorities  to  tasks.  The  assignment  can  be  carried  out  either  statically  or  dynamically.  If  fixed 
priorities  are  assigned  to  tasks  once  and  for  all,  then  the  priority  assignment  is  said  to  be  static.  A 
fixed  priority  list  of  jobs  can  be  constructed  by  sorting  the  tasks  in  decreasing  order  of  their  priorities. 
At  any  time  t,  when  a  request  for  processing  is  initiated  or  when  the  processing  of  a  task  is 
completed, the priority list will be scanned from the top to find the first active task in it. Once it is
located, this highest priority active task will be processed next.

On the other hand, if the priorities of tasks can change from request to request, then the priority assignment is said to be dynamic. The implementation of a dynamic priority-driven algorithm is similar to that of a static one except for the use of a dynamic priority list. At any time t, the first task in the list, Jj, will be processed. When Jj is completed, it will be removed and the new first task in the list will be processed. If during the processing of Jj a new request for a task Jk is initiated, then task Jk will be inserted into the list by comparing its priority with the priorities of the tasks that are already in the list. If Jk is the new first task in the list, then it will be processed immediately when preemption is allowed; otherwise, the processing of Jj will continue.

A  typical  priority-driven  algorithm  is  the  earliest-deadline-first  algorithm. 
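The dynamic priority list described above can be sketched for the earliest-deadline-first rule. The code below is an illustrative simulation only, assuming a single processor, preemption allowed, integer times, and run times of at least one unit; the function name and the tuple layout are hypothetical.

```python
import heapq


def edf_schedule(tasks):
    """Simulate preemptive earliest-deadline-first on one processor.

    tasks: list of (name, release, run_time, deadline) with integer times
    and run_time >= 1. Returns (timeline, finish): timeline[t] is the task
    run during [t, t+1) (None when idle), and finish maps each task name
    to its completion time.
    """
    pending = sorted(tasks, key=lambda task: task[1])  # ordered by release time
    remaining = {name: run for name, _, run, _ in tasks}
    ready = []  # dynamic priority list: a heap ordered by deadline
    timeline, finish = [], {}
    t = i = 0
    while i < len(pending) or ready:
        # Insert newly initiated requests into the priority list.
        while i < len(pending) and pending[i][1] <= t:
            name, _, _, deadline = pending[i]
            heapq.heappush(ready, (deadline, name))
            i += 1
        if not ready:
            timeline.append(None)  # processor idles for one time unit
            t += 1
            continue
        _, name = ready[0]         # first task in the list runs next
        timeline.append(name)
        remaining[name] -= 1
        t += 1
        if remaining[name] == 0:   # completed: remove it from the list
            heapq.heappop(ready)
            finish[name] = t
    return timeline, finish


if __name__ == "__main__":
    print(edf_schedule([("A", 0, 3, 10), ("B", 1, 2, 4)]))
```

In the sample run, task B arrives at time 1 with an earlier deadline than A, becomes the first task in the list, and preempts A; A resumes once B has completed.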

3.  Resource  Constraints: 

A  multiprocessor  system  has  various  types  of  resources,  say  r  types.  These  resources  may  be 
different  kinds  of  processors,  memory  modules,  buses,  and  so  on,  with  distinct  characteristics.  We 
specify the resource requirements of job Ji by a function R(Ji) = (Ri1, Ri2, ..., Rir), with Riy being the number of units of the y-th resource that is needed by task Ji during its execution. These resources in R(Ji) are assumed to be used simultaneously during the execution of job Ji.

If we denote the number of units of the y-th resource type in the system by Ry > 0, 1 ≤ y ≤ r, with Ry ≥ Riy, 1 ≤ i ≤ n, then the total resource availability in the system will be specified by the vector R = {R1, R2, ..., Rr}. The existence of resource constraints requires that the total number of resources of the various types which are needed by the jobs being processed at any given instant of time does not exceed the total available amount of resources as specified in the vector R.
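The resource constraint amounts to a componentwise comparison between the summed requirement vectors of the jobs in execution and the availability vector R. A minimal sketch follows; the function name and the sample vectors are hypothetical.

```python
def resources_ok(running_requirements, available):
    """Return True if the jobs processed together fit within the vector R.

    running_requirements: one requirement vector per job currently in
    execution, each of the same length as `available`.
    available: number of units of each resource type in the system.
    """
    totals = [
        sum(req[y] for req in running_requirements)
        for y in range(len(available))
    ]
    return all(used <= cap for used, cap in zip(totals, available))


if __name__ == "__main__":
    R = [4, 2]  # e.g. 4 memory modules and 2 buses (hypothetical)
    print(resources_ok([[2, 1], [2, 1]], R))
    print(resources_ok([[3, 1], [2, 1]], R))
```

The first call uses exactly the available units and is feasible; the second needs five units of the first resource type and violates the constraint.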

4.  Precedence  Constraints: 

In  many  situations  it  is  assumed  that  tasks  are  independent  in  the  sense  that  the  processing  of  a 
task  will  not  depend  on  the  processing  of  other  task(s).  This  means  that  all  necessary  information  and 
data  required  for  the  processing  of  the  tasks  are  self-contained.  The  independence  assumption  greatly 
reduces the complexity of the problem.

However,  in  many  applications,  restrictions  on  the  order  in  which  tasks  can  be  executed  arise 
naturally. Consider two tasks Ji and Jj with the property that the execution of Jj will require some information from the execution of Ji. This means that Ji must first be executed to generate the necessary data for Jj. Consequently, each processing of Jj must be preceded by a complete execution of Ji. Normally, this kind of precedence relation between two tasks extends to a larger set of tasks.

The  precedence  relation  is  a  nonempty  partial  ordering  relation  defined  on  the  set  of  tasks.  The 
partial  ordering  relation  is  modeled  by  a  directed  acyclic  graph  (DAG),  G.  Each  task  is  represented 
by a vertex of G. A directed edge from Ji to Jj indicates that Ji must be processed before Jj. Job Ji is said to be a predecessor of Jj and Jj is said to be a successor of Ji. Precedence constraints can also be expressed as an ordered pair (Ji, Jj) ∈ R, where R is a precedence relation which lists all allowable orderings between jobs Ji and Jj.

In  scheduling  a  given  task,  the  precedence  constraints  induced  by  the  partial  ordering  relation  on 
the set of tasks must be satisfied. If a predecessor of job Ji has not been completely executed, then the processing of Ji must be delayed and the task is said to be blocked. When all the predecessors of Ji have been completely executed, the processing of Ji can be started as soon as scheduled, and Ji is said to be unblocked. Thus, at any instant, task Ji can be classified as being in one of the following four states:

i.  Inactive state: the current request for Ji has already been completed and Ji is waiting for its next request to be initiated;

ii.  Ready state: a request for Ji has been initiated and the task is waiting to be unblocked;

iii.  Active state: a request for Ji has been initiated and the task is unblocked;

iv.  Executing state: task Ji is currently being processed.

A precedence relation may be tree-like, forest-like, or arbitrary (that is, any acyclic precedence relation).
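A precedence relation modeled as a directed acyclic graph can be represented by mapping each task to its predecessors. The sketch below is illustrative only: it uses Python's standard `graphlib` module (Python 3.9 or later) to obtain one ordering that respects the constraints, and a hypothetical helper `unblocked` to list the tasks whose predecessors have all been completely executed.

```python
from graphlib import CycleError, TopologicalSorter


def unblocked(preds, completed):
    """Tasks not yet completed whose predecessors have all been executed."""
    return sorted(
        task for task, before in preds.items()
        if task not in completed and set(before) <= set(completed)
    )


def feasible_order(preds):
    """One processing order satisfying every precedence constraint.

    Raises graphlib.CycleError if the relation is not acyclic.
    """
    return list(TopologicalSorter(preds).static_order())


if __name__ == "__main__":
    # Hypothetical example: J1 precedes J2 and J3, which both precede J4.
    preds = {"J1": [], "J2": ["J1"], "J3": ["J1"], "J4": ["J2", "J3"]}
    print(unblocked(preds, set()))
    print(unblocked(preds, {"J1"}))
    print(feasible_order(preds))
```

Initially only J1 is unblocked; once J1 has completed, J2 and J3 become unblocked, while J4 stays blocked until both of them have finished.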

5.  Synchronization  &  Communication  Issues: 

From  the  viewpoint  of  processes,  there  are  two  basic  process  synchronization  and  communication 
models:

i.  The shared-memory model in which the system has a global memory accessible by all processors. The processes communicate through shared variables. In such a system, the access time to a unit of data is the same for all processors. A hardware device or a software protocol is required in such systems for arbitrating the access to the memory among the processes sharing it. The shared memory may cause a software bottleneck.

ii.  The  message-passing  model  in  which  processors  communicate  by  passing  messages  explicitly 

through  an  interprocessor  communication  network.  The  performance  of  an  algorithm  in  these 
systems  depends  on  how  well  the  application  problem  is  partitioned  and  mapped  into  the 
processors,  and  on  the  efficiency  of  the  communication  mechanism.  A  distributed-memory 
multiprocessor  system  does  not  have  the  software  bottleneck  as  in  the  case  of  a  shared-memory 
system;  however,  it  does  experience  interprocessor  communication  problems. 

The  interprocessor  communication  mechanism  and  the  computational  power  of  the  individual 
processor are two of the major factors which affect the performance of the system. In order to fully
exploit the power of a distributed-memory multiprocessor system, there must be a balance between
computation  and  communication.  Shih  and  Fier  [87]  suggest  that  an  ideal  ratio  of  1:1  between 
computation  and  communication  should  be  achieved.  It  has  been  shown  though  (Dunnigan  [87], 
Heath  [87]),  that  the  communication/computation  ratios  of  some  of  the  present  distributed-memory 
systems  are  very  high  (>  10).  This  indicates  that  the  communication  mechanism  provided  in  these 
machines  does  not  match  the  speed  of  powerful  processors  and  thus  becomes  the  major  bottleneck  of 
the  system  performance.  This  problem  becomes  more  critical  as  processors  continue  to  become  faster 
and  more  powerful.  Therefore,  in  order  to  improve  the  overall  system  performance,  the 
communication  overhead  must  be  significantly  reduced,  and  efficient  methods  of  handling  all  types  of 
communication  should  be  introduced. 

Information items communicated in multiprocessors include synchronization primitives, status
semaphores, interrupt signals, variable-size messages, shared variables, and data values.

Three  basic  communication  strategies  are  in  use: 
(i)  Unicast (One-to-One) communication is the sending of a message from a source node to one destination node. This type of communication is directly supported by all distributed-memory multiprocessors.

(ii)  Broadcast (One-to-All) communication is a type of information exchange in which a source node wishes to send a message to all other nodes in the system (Ho & Johnson [86]).

(iii)  Multicast (One-to-Many) communication where a node wants to send the same message to k other nodes.

To  send  messages  to  the  right  destination  at  the  right  time  is  very  important  in  many  applications, 
since  a  node  may  have  to  await  a  message  from  some  other  nodes  before  continuing  its  computation. 
Therefore,  it  is  desirable  that 

(a)  each  individual  destination  receives  the  message  through  a  shortest  path;  and 

(b)  the  number  of  intermediate  nodes  required  to  deliver  the  source  message  to  all  destinations  is 
as  small  as  possible,  in  order  to  reduce  the  traffic  created  by  the  multicast  communication  in  the 
network  (Lan  et  al  [88a,  88b]). 

When  a  message  delivery  between  non-neighboring  nodes  occurs,  the  message  has  to  be 
forwarded  through  some  intermediate  nodes.  Two  transport  mechanisms  are  currently  used: 

(1)  Circuit  switching:  where  a  physical  communication  path  between  the  source  and  the  destination 

has to be established first. Then, the source can send out the message to the destination. The
routing  overhead  is  paid  only  once  at  the  circuit  set-up  time.  Also,  no  link  along  the  established 
path  can  be  shared  by  another  message  delivery. 

(2)  Packet  switching:  No  physical  path  is  established  before  the  starting  of  a  communication.  A 
message  is  decomposed  into  packets  which  are  sent  out  individually.  The  source  determines  its 
output  link(s)  and  sends  the  message  to  the  neighboring  node(s).  Then,  each  of  the  nodes  which 
receives  the  message  will  decide  its  output  link(s)  for  further  forwarding,  and  so  on.  One  link  is 
requested  at  a  time,  and  released  immediately  after  it  is  used.  Routing  decisions  need  to  be  made 
for  each  packet.  A  buffer  is  needed  at  each  node  to  temporarily  store  the  message.  The 
efficiency  of  the  communication  depends  on  the  strategy  for  making  the  routing  decisions  as  well 
as  the  size  of  the  packets. 

When  message  passing  is  used  as  the  means  for  interprocessor  communication,  if  a  node  wants  to 
send  a  message  to  a  neighboring  node,  the  message  delivery  is  relatively  simple.  However,  if  a  node 
wants  to  send  a  message  to  a  distant  node,  the  message  has  to  traverse  through  some  intermediate 
nodes.  To  send  a  message  from  a  node  to  several  other  nodes,  the  situation  becomes  more 
complicated.  One  of  the  major  problems  in  interprocessor  communication  is  to  determine  which 
path(s) should be used to deliver a message from a source node to some destination node(s), i.e. message routing. Message routing techniques may be classified as centralized or distributed. In centralized routing, the source node determines all the intermediate nodes for message delivery. The addresses of all intermediate nodes must be tagged onto the message, and hence a particular path is specified for each of the destinations. This will create extra communication overhead, especially in the case of multi-destination message routing. In distributed routing, the source node and each node involved in message forwarding specify only which of their neighboring node(s) are to be involved in the message delivery.
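The difference between the two routing styles can be illustrated on a binary hypercube, where two nodes are neighbors if their addresses differ in exactly one bit. The hypercube topology and the rule of correcting the lowest differing bit first are assumptions made for this sketch and are not prescribed by the text; the function names are hypothetical.

```python
def next_hop(current, dest):
    """Distributed routing step: a node chooses only its own next neighbor.

    Flips the lowest-order bit in which `current` differs from `dest`.
    Returns None when the message has arrived.
    """
    diff = current ^ dest
    if diff == 0:
        return None
    lowest_bit = diff & -diff
    return current ^ lowest_bit


def source_route(src, dest):
    """Centralized routing: the source computes the whole path in advance.

    The returned list of node addresses would be tagged onto the message.
    """
    path = [src]
    while path[-1] != dest:
        path.append(next_hop(path[-1], dest))
    return path


if __name__ == "__main__":
    print(source_route(0b000, 0b101))
```

With centralized routing the message carries the full list of addresses, which is the extra overhead noted above; with distributed routing each node calls only `next_hop`, so the message needs to carry just the destination address. In both cases the number of hops equals the number of differing address bits, which is a shortest path in a hypercube.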

D.  Service  Disciplines 

1.  General  Overview 

By  service  disciplines  we  mean  the  order  in  which  the  arriving  tasks  are  handled  at  a  processing 
unit. 

There  are  two  basic  schemes.  If  the  service  discipline  allows  jobs  to  be  interrupted  by  new 
arrivals,  we  speak  of  a  service  discipline  allowing  preemption.  If  such  an  interruption  is  not 
allowed,  the  service  discipline  is  referred  to  as  non-preemptive.  In  the  case  of  preemption,  if 
processing  of  the  interrupted  job  can  resume  where  it  left  off  without  loss  of  information,  it  is  a 
preemptive-resume  service  discipline;  if  the  job  has  to  be  started  again,  we  speak  of  a  preemptive- 
repeat  service  discipline. 

Numerous  service  policies  are  discussed  in  the  literature.  The  most  common  and  simplest  service 
discipline is the First-Come-First-Served (FCFS), where the jobs are handled in their order of arrival. It is a policy that favors the longest waiting job irrespective of the amount of service time demanded by it. Average turnaround time and average waiting time of jobs are generally not minimal, especially for short jobs. The policy is also unfair in that unimportant jobs make important jobs wait. Another common service discipline is the Shortest-Job-First (SJF). Here, associated with each job is the length of its burst (estimated service time). The job with the shortest burst is the next one to receive service. This policy minimizes the average waiting time of a task. The main difficulty with SJF is that it can effectively prevent jobs that require a long time from receiving service. Preemptive
SJF  is  called  Shortest-Remaining-Time  First  (SRT).  SRT  will  preempt  the  currently  executing  job 
and  start  executing  a  new  job  just  arrived  with  a  shorter  burst  than  what  is  left  of  the  currently 
executing  one.  An  algorithm  which  is  a  balance  between  FCFS  and  SJF  extremes  is  the  Highest 
Waiting  Ratio  Next  (HWRN).  It  gives  fairly  quick  response  to  the  short  jobs,  but  also  limits  the 
waiting  time  of  longer  jobs  (aging).  The  service  discipline  which  handles  its  jobs  in  the  reversed 
order  of  arrival  is  referred  to  as  the  Last-Come-First-Served  (LCFS).  LCFS  can  be  applied  to 
preemptive  scheduling,  so  that  the  jobs  in  service  are  interrupted  by  each  new  arrival. 
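The effect of the service order on the average waiting time can be checked with a small calculation. The sketch below assumes non-preemptive service and that all jobs are present at time 0. The HWRN selection rule is written with the ratio (waiting time + burst) / burst, which is a commonly used definition and is an assumption here, since the text does not give the formula; all names and numbers are hypothetical.

```python
def average_wait(bursts):
    """Average waiting time when jobs are served in the given order."""
    waited, clock = 0, 0
    for burst in bursts:
        waited += clock  # this job waited until all earlier jobs finished
        clock += burst
    return waited / len(bursts)


def hwrn_next(jobs, now):
    """Pick the waiting job with the highest (waiting + burst) / burst ratio.

    jobs: dict mapping a job name to (arrival_time, burst).
    """
    def ratio(name):
        arrival, burst = jobs[name]
        return ((now - arrival) + burst) / burst
    return max(jobs, key=ratio)


if __name__ == "__main__":
    bursts = [24, 3, 3]                  # arrival order
    print(average_wait(bursts))          # FCFS order
    print(average_wait(sorted(bursts)))  # SJF order
    print(hwrn_next({"long": (0, 10), "short": (8, 2)}, now=12))
```

For these bursts the FCFS average wait is 17 time units and the SJF average wait is 3. Under the assumed HWRN ratio, a short job overtakes a long one only after it has waited for a while, and a long job's ratio keeps growing as it waits, which is the aging effect mentioned above.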

If  there  are  groups  of  jobs  at  a  processing  unit,  and  one  group  has  priority  in  accessing  the  servers 
over  another  group,  then  we  have  a  priority  service  discipline.  The  servers  are  allocated  to  the  tasks 
with  the  highest  priority.  SJF  can  be  regarded  as  a  special  case  of  priority  scheduling.  Priorities  can 
be defined externally (for example: the kind of account the user has) or internally (such as time limits
or memory requirements).

Another  common  scheduling  algorithm,  which  is  preemptive,  is  Round  Robin  (RR).  Here,  the 
processor's  time  is  equally  divided  over  all  jobs  in  the  queue.  This  is  done  by  dividing  the 
processor's service into quanta of time with each quantum = 1/N units, where N is the number of jobs in the queue. When a job has used up its time-share, it will be preempted and added to the end of the ready queue, while the next job is placed at the beginning of the queue. This exchange of jobs is called swapping, which is the main source of overhead in RR and in preemptive policies in general. In its extreme cases: if the quantum → ∞, then we have the FCFS discipline; if on the other hand the quantum → 0+ (e.g. executes only one instruction), then we have processor sharing
(PS).  Processor  sharing  is  not  practically  feasible;  nonetheless,  it  is  used  in  system  analysis  to 
approximate  complex  situations  under  specific  assumptions.  Essentially,  PS  behaves  as  if  each  of  the 
n  jobs  has  its  own  processor  running  at  1/n  the  speed  of  the  real  processor. 
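The behavior of Round Robin and its FCFS limit can be illustrated with a short simulation. This sketch assumes a fixed quantum supplied by the caller, all jobs present at time 0, and zero swapping overhead; the function name is hypothetical.

```python
from collections import deque


def round_robin(bursts, quantum):
    """Return the completion time of each job under Round Robin.

    bursts[j] is the service time of job j; jobs start queued in index order.
    """
    queue = deque(enumerate(bursts))
    done = [0] * len(bursts)
    clock = 0
    while queue:
        job, left = queue.popleft()
        run = min(quantum, left)
        clock += run
        if left > run:
            queue.append((job, left - run))  # preempted: back of the queue
        else:
            done[job] = clock
    return done


if __name__ == "__main__":
    print(round_robin([3, 5, 2], quantum=2))
    print(round_robin([3, 5, 2], quantum=100))
```

With a quantum of 2 the completion times are [7, 10, 6], so the shortest job finishes first. With a quantum larger than every burst no job is ever preempted and the completion times [3, 8, 10] are those of FCFS.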

There are other scheduling policies used in such cases as bus arbitration or page replacement in
virtual storage management. For instance, the Least-Recently-Used (LRU) strategy gives the highest
priority  to  the  requesting  processor  that  has  not  used  the  bus  for  the  longest  time.  This  assumes  that 
past  behavior  is  a  good  indicator  of  the  future.  LRU  requires  a  substantial  overhead  and  hardware 
assistance  since  priorities  must  be  reassigned  after  each  bus  cycle.  Strategies  that  approximate  LRU 
are  available  which  have  lower  overhead  and  hardware  requirements. 

Another  class  of  service  disciplines  has  been  introduced  for  situations  in  which  jobs  are  easily 
classified  into  different  groups  based  on  some  criterion  such  as  different  response  time  requirements. 
A multi-queue scheme partitions the ready queue into separate queues (system tasks, interactive,
batch,  and  so  on),  and  each  job  is  assigned  once  and  for  all  to  one  of  the  queues  based  on  some 
property  of  the  job.  Each  queue  has  its  own  scheduling  policy.  In  addition,  there  is  a  scheduling 
policy  between  the  queues. 

In  multi-level  feedback  queues,  each  job  is  allowed  to  move  between  queues.  A  new 
process  initially  enters  at  the  back  of  the  top  queue.  It  then  moves  through  the  queue  according  to  the 
queue service discipline, until it gets served. If the job is completed, or it is waiting for I/O
completion,  then  the  job  leaves  the  queue.  If  the  quantum  expires  before  the  job  finishes  execution, 
the  job  will  be  placed  at  the  back  of  the  next  lower  level  queue.  The  job  is  next  serviced  when  it 
reaches  the  head  of  that  queue,  if  the  first  queue  is  empty.  As  long  as  the  job  continues  using  the  full 
quantum  provided  at  each  level,  it  continues  to  move  to  the  next  lower-level  queue.  In  general,  this 
system is defined by several parameters: the number of queues, the service policy of each queue, a
method of determining when to upgrade (demote) a job to a higher (lower) priority queue, and a method
of determining which queue a job will enter when it needs service. Finally, although a multi-level
feedback  queue  is  the  most  general  scheme,  it  is  also  the  most  complicated  one.  It  can  be  configured 
to  match  a  specific  system  under  design. 
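The demotion rule of a multi-level feedback queue can be sketched as follows. The simulation makes several simplifying assumptions that are not part of the text: all jobs arrive at time 0, there is no I/O, jobs are never upgraded, each level has a fixed quantum, and a job at the lowest level is returned to the back of that same level. The function name is hypothetical.

```python
from collections import deque


def mlfq(bursts, quanta):
    """Return {job index: completion time} under a multi-level feedback queue.

    quanta[k] is the quantum of level k, with level 0 as the top queue.
    """
    queues = [deque() for _ in quanta]
    for job, burst in enumerate(bursts):
        queues[0].append((job, burst))  # new jobs enter the top queue
    clock, done = 0, {}
    while any(queues):
        # Serve the highest-priority queue that is not empty.
        level = next(k for k, q in enumerate(queues) if q)
        job, left = queues[level].popleft()
        run = min(quanta[level], left)
        clock += run
        if left > run:
            # Quantum expired: demote to the next lower level.
            lower = min(level + 1, len(queues) - 1)
            queues[lower].append((job, left - run))
        else:
            done[job] = clock
    return done


if __name__ == "__main__":
    print(mlfq([2, 7, 4], quanta=[2, 4]))
```

In the sample run the job with burst 2 completes within its first quantum at time 2, while the two longer jobs are demoted and complete at times 13 and 12 respectively.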
2.  Static vs. Dynamic Scheduling:

There are two primary approaches to parallel/distributed processing and scheduling of tasks on a
multiple-processor/computer system:

i.  Static Scheduling, which assumes that all parallel tasks in a computation are known a priori
(that is, complete knowledge of the tasks' characteristics at the system initiation time is required). They
are  statically  mapped  onto  processors  before  the  computation  is  initiated,  and  remain  there  throughout 
the  entire  computation.  The  mapping  can  be  done  off-line  on  an  auxiliary  scheduling  processor, 
eliminating  many  of  the  information  acquisition  problems  faced  when  scheduling.  In  such  a 
paradigm,  two  different  software  design  styles  are  used: 

a)  A  universal  task  design  style,  as  with  the  uniform  grid  used  for  solving  partial  differential 
equations,  where  each  of  the  grid  blocks  is  assigned  to  a  task  (Ortega  &  Voigt  [85]);  and 

b)  A  network  of  communicating  tasks,  where  the  topology  can  be  arbitrary  and  irregular  and  both  the 
amount and frequency of intertask data transfer may vary during the computation; however, the
number  of  tasks  and  their  potential  communication  patterns  are  known.  This  permits  a  static 
mapping  of  tasks  onto  processors  (Hoare  [78]). 

Static  Scheduling  Techniques 


Graph  Theoretic  Integer  Programming  Clustering  Partitioning 

Techniques  Techniques  Techniques  Algorithms 

(Stone  77)  (Gylys  &  Edwards  76)  (Williams  83, 84)  (Bergu  &  Bokhari  87) 

Fig.  2 

Static  algorithms  have  been  subject  of  extensive  study.  They  are  relatively  easy  to  design  and 
have  proven  more  tractable  than  dynamic  algorithms.  Their  main  disadvantages,  however,  are  that 
they  are  often  inflexible  and  cannot  adapt  to  the  dynamics  of  the  environment 

Many  approaches  to  the  solution  of  static  scheduling  problems  have  been  proposed,  some  of 
which  are  shown  in  the  figure  above. 

ii.  Dynamic  Scheduling  which  calls  for  tasks  to  be  scheduled  as  they  arrive.  A  parallel 
computation  is  defined  in  terms  of  a  dynamically  created  task  precedence  graph.  New  tasks  are 
initiated  and  existing  tasks  terminate  as  the  computation  unfolds,  and  the  mapping  of  tasks  onto 
processors  is  done  dynamically.  Dynamically  created  tasks  must  be  assigned  to  processors  in  real 
time  by  a  scheduling  algorithm  executing  on  the  multicomputer  system.  This  dynamic  view  of 
computation differs in several significant ways from the static view. For example, the workload is
allowed to vary over time, multiple tasks may become eligible for execution given the results from a
single task, and tasks are dynamically mapped onto processors using only partial knowledge of the
global system state.

Algorithms for dynamic task scheduling fall into one of two main categories:

a) Task Placement Algorithms, where tasks are assigned to processors before the tasks begin
execution, and all tasks execute where placed even though moving a task later might reduce the
load imbalance.

b) Task Migration Algorithms, where tasks are also assigned to processors before they begin
execution; however, tasks can move after their initial placement.
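
The placement category can be illustrated with a minimal sketch. It is not an algorithm taken from the cited literature: the function name `place_tasks`, the least-loaded rule, and the sample task times are assumptions made purely for illustration. Each task is assigned once, on arrival, and never moves; a migration algorithm would additionally be allowed to revise the assignment later.

```python
import heapq

def place_tasks(task_times, num_processors):
    """Greedy placement: each arriving task goes to the processor with the
    smallest accumulated load and stays there (no migration)."""
    # Heap of (current_load, processor_id); the least-loaded processor is on top.
    heap = [(0.0, p) for p in range(num_processors)]
    heapq.heapify(heap)
    assignment = []
    for t in task_times:                      # tasks are handled in arrival order
        load, p = heapq.heappop(heap)
        assignment.append(p)
        heapq.heappush(heap, (load + t, p))
    loads = [0.0] * num_processors
    for load, p in heap:                      # read back the final load per processor
        loads[p] = load
    return assignment, loads

# Hypothetical workload: five tasks arriving in this order on two processors.
assignment, loads = place_tasks([4, 3, 2, 2, 1], num_processors=2)
print(assignment, loads)
```

The scheduler here uses only the current processor loads, which mirrors the point that dynamic mapping works from partial knowledge of the system state.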

Whether  placement  or  migration  is  superior  depends  on  the  structure  of  the  parallel  computation. 
Several  approaches  to  the  solution  of  the  dynamic  scheduling  problem  have  been  introduced,  some  of 
which  can  be  seen  in  the  figure  below. 


Fig. 3 Dynamic Scheduling Techniques

Systems  based  on  dynamic  scheduling  algorithms  are  flexible  and  can  adapt  to  the  dynamics  of 
the  environment  easily. 

E.  Performance  Measures 

To compare the performance of one computer system to the performance of another or to some
standard, a wide range of techniques is used depending on the particular situation. An important first
step, however, is deciding exactly what aspects of the system's performance need to be measured
and under what criteria. Performance measures that are in use may be grouped into two categories:
job-oriented and system-oriented ones.

1.  Job-Oriented  Measures: 

These  measures  express  performance  from  the  perspective  of  the  tasks  (or  users).  They  are 
closely  allied  to  the  goal  of  user  satisfaction.  Examples  of  these  measures  are: 

(i)  Flowtime:  This  is  the  length  of  the  time  interval  from  submission  to  completion  for  a  particular 
job.  Flowtime  is  also  frequently  referred  to  as  the  turnaround  time  of  the  job.  In  most  scheduling 
problems, the following relationship is assumed to hold

Fig. 4 Some Performance Measures for Computing Systems: Flow Time, Response Time, Waiting
Time, Makespan, Throughput, Utilization Ratio, Idle Time, Speedup

2.  System-Oriented  Measures: 

These  are  measures  that  take  into  account  the  system's  point  of  view  rather  than  that  of  the  task  or 
user.  Several  such  measures  exist,  such  as: 

(i)  The  system's  throughput  which  is  the  number  of  jobs  that  are  completed  per  unit  time.  It  is  a 
measure  of  how  much  work  a  computing  system  is  performing  in  a  given  time  slot.  It  is 
important,  when  using  this  measure  to  compare  algorithms,  to  perform  tests  with  the  same  or  very 
similar  sets  of  jobs.  Otherwise,  throughput  can  be  a  misleading  measure. 

(ii) The system's utilization ratio which is normally expressed as the fraction (or percent) of the
total time that each processor is kept busy (or the average over the processors of the system).
Thus,  utilization  can  be  expressed  as  the  ratio  of  processing  time  to  available  time  for  each 
processor  (or  for  the  overall  system).  Essentially,  the  idea  is  to  keep  each  processor  as  busy  as 
possible.  In  doing  so,  we  must  avoid  putting  heavy  load  on  some  processors  and  light  load  on 
others,  i.e.  imbalance  in  load  distribution.  Load  balancing  (and  sharing)  techniques  are  used  to 
equalize  the  utilization  of  the  various  processors  in  the  system.  These  techniques  are  sometimes 
treated  as  scheduling  strategies,  with  the  purpose  of  improving  the  utilization  of  the  system's 
resources. 

(iii) The idle time of a processor is the time interval when the processor is available for processing,
but not being used. Thus, the idle time of machine $M_i$, $I_i$, is given by

$$I_i = C_{max} - \sum_{j=1}^{n} P_{ij},$$

where $C_{max}$ is the makespan and the summation represents the total processing time on $M_i$.
The  average  idle  time  of  the  overall  system  is  often  considered.  We  may  seek  to  minimize  the 
total  idle  time,  the  weighted  sum  of  idle  time,  or  the  mean  idle  time  of  the  system.  This  measure 
is  closely  related  to  the  previous  one. 
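
As a small worked illustration of the utilization ratio and idle time, the sketch below computes both per machine from a table of processing times. The function name, the data layout (`processing[i][j]` is the time job j spends on machine i), and the fallback of taking the makespan as the largest busy time when none is supplied are assumptions for this example only.

```python
def idle_and_utilization(processing, makespan=None):
    """Return (C_max, idle time per machine, utilization per machine)."""
    busy = [sum(row) for row in processing]          # total processing time per machine
    # Assumption: if no makespan is given, the busiest machine never waits.
    c_max = makespan if makespan is not None else max(busy)
    idle = [c_max - b for b in busy]                 # idle time = makespan - busy time
    utilization = [b / c_max for b in busy]          # fraction of available time used
    return c_max, idle, utilization

# Hypothetical two-machine, three-job example.
c_max, idle, utilization = idle_and_utilization([[3, 2, 0], [0, 0, 4]])
print(c_max, idle, utilization)
```

The two quantities carry the same information: for each machine, utilization equals one minus idle time divided by the makespan.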

(iv)  Another  performance  measure,  which  is  used  to  indicate  the  improvement  achieved  in  a  system 
due  to  a  specific  procedure  or  due  to  a  variation  in  the  system  itself,  is  the  speedup.  Speedup 
can  be  affected  by  several  factors,  such  as  the  number  of  processors,  the  type  of  jobs,  the 
interconnection between processors, the scheduling policy used, etc. Speedup is usually
evaluated for one of these parameters at a time, keeping the others constant. When the same set of
jobs is executed on a single- and a multiple-processor system, an appropriate definition of
speedup may be

$$S = \frac{\text{Sequential processing time}}{\text{Concurrent processing time}}.$$

F.  Optimality  Criteria  and  Strategies 

1.  General  Overview: 

There  are  numerous,  complex,  and  often  conflicting  objectives  that  are  to  be  achieved  in 
performance  evaluation  studies.  Many  criteria  have  been  suggested  for  comparing  systems  and 
algorithms  and  judging  their  effectiveness.  Which  characteristics  are  used  for  comparison  can  make  a 
substantial  difference  in  the  determination  of  the  best  system  or  algorithm. 

In general, let $X_j$ be any quantity associated with job $J_j$. Then, one of the following measures
might be of interest:

$X = \sum_{j=1}^{n} X_j$    Sum of $X_j$ for all n jobs;

$\bar{X} = \frac{1}{n} \sum_{j=1}^{n} X_j$    Average (mean) value over all jobs;

$X_{max} = \max_{1 \le j \le n} \{ X_j \}$    Maximum (peak) value over all jobs;

$X_w = \sum_{j=1}^{n} w_j X_j$    Total weighted sum over all jobs, with $w_j$ being the weighting
factors, usually summing to 1.
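
The four aggregate measures can be computed directly, as in the following sketch. The function name, the dictionary keys, and the sample flow times and weights are illustrative assumptions; when no weights are supplied, equal weights summing to 1 are assumed, in which case the weighted sum coincides with the mean.

```python
def aggregate_measures(x, weights=None):
    """Sum, mean, maximum and weighted sum of a per-job quantity X_j."""
    n = len(x)
    if weights is None:
        weights = [1.0 / n] * n               # assumed default: equal weights
    return {
        "sum": sum(x),
        "mean": sum(x) / n,
        "max": max(x),
        "weighted": sum(w * xj for w, xj in zip(weights, x)),
    }

# Hypothetical flow times of three jobs, with the first job weighted most heavily.
m = aggregate_measures([4, 6, 10], weights=[0.5, 0.25, 0.25])
print(m)
```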

The quantities defined above are simple objective functions or criteria of performance. More
general objectives are constructed with nonlinear penalty functions $f_j(X_j)$, such as

$X = \sum_{j=1}^{n} f_j(X_j)$    Total penalty

$X_{max} = \max_{1 \le j \le n} \{ f_j(X_j) \}$    Maximum penalty

In any of the above cases, we have single objectives. It is also possible to extend this to
multiple-objective schemes by:

a) forming a composite objective such as Y + Z, with Y and Z being two single objectives; or

b) using primary and secondary objectives; for instance, by minimizing Y subject to $Z = Z^*$,
where Z is the primary objective and $Z^*$ is some optimized value while Y is the secondary
objective.


Fig. 5 Performance Evaluation Objectives: Simple, General, Composite, Primary & Secondary

A policy is said to be optimal with respect to a certain performance measure if it belongs to an
equivalence class such that no nonempty class preferred over it exists.

If  the  performance  metrics  are  regarded  as  cost  (or  reward)  functions,  then  the  optimality  criterion 
might  be: 

(1) to minimize the maximum cost (minimax), for instance, to minimize the maximum completion
time: $C_{max} = \max_{1 \le j \le n} (C_j)$

(2) to maximize the average reward, for instance, to maximize the average number of jobs being
processed at a time t.

(3) to minimize the total cost (minisum), as in the case of minimizing the total completion time:
$C = \sum_{j=1}^{n} C_j$

In a given application, once the criteria have been defined, it is possible to evaluate the various
algorithms under consideration. Table 1 shows some of the quantities of interest.
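
The difference between a minimax and a minisum criterion can be seen on a toy problem. The sketch below is an illustrative assumption rather than part of the report: three jobs with made-up processing times are sequenced on a single machine with no idle time, and every ordering is enumerated. The maximum completion time is the same for all orderings, whereas the total completion time depends on the order.

```python
from itertools import permutations

def completion_times(order, p):
    """Completion time of each job when run back to back in the given order."""
    t, c = 0, {}
    for j in order:
        t += p[j]
        c[j] = t
    return c

# Hypothetical processing times for three jobs on one machine.
p = {"J1": 5, "J2": 2, "J3": 3}

results = []
for order in permutations(p):
    c = completion_times(order, p)
    results.append((order, max(c.values()), sum(c.values())))  # (order, C_max, sum C_j)

best_minisum = min(results, key=lambda r: r[2])
print(best_minisum)
```

In this example the minisum optimum is the shortest-processing-time-first order, while the minimax criterion cannot distinguish between the orderings at all.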

2.  A  Note  On  Relations  Between  Criteria: 

Some criteria happen to be equivalent, in the sense that a policy that is optimal with respect to one
of them is also optimal with respect to the others. Knowing these equivalence relations reduces the
number of separate problems that we have to deal with. Let us consider some of these cases. In
section B we defined several time quantities related to the description of a given job $J_j$ (see also
fig. 1). Based on that information, the following relationships hold for each job:

$$L_j = F_j - a_j = C_j - r_j - a_j = C_j - d_j \quad \text{(by definition)}$$

For n jobs:

$$\sum L_j = \sum F_j - \sum a_j = \sum C_j - \sum r_j - \sum a_j = \sum C_j - \sum d_j$$

Hence, $\bar{L} = \bar{F} - \bar{a} = \bar{C} - \bar{r} - \bar{a} = \bar{C} - \bar{d}$.

Now, if $\bar{a}$, $\bar{r}$, $\bar{d}$ are given constants and a policy is optimal with respect to
$\bar{F}$, then it will also be optimal for $\bar{C}$ and $\bar{L}$.

If $r_j = 0$ for all jobs, then $C_j = F_j$, and $C_{max}$ and $F_{max}$ are identical. In general,
however, $F_{max}$ and $C_{max}$ are quite distinct performance measures while $\bar{C}$ and
$\bar{F}$ are essentially equivalent.

The  average  waiting  time  is  also  related  to  these  measures  since 

$$C_j = r_j + W_j + \sum_{i=1}^{m} P_{ij} = r_j + F_j.$$

Therefore,

$$\bar{C} = \bar{r} + \bar{W} + \frac{1}{n} \sum_{j=1}^{n} \sum_{i=1}^{m} P_{ij} = \bar{r} + \bar{F} = \bar{L} + \bar{d}.$$

Notice that $\bar{r}$, $\bar{d}$, $\bar{P}$ are constants for a given problem and independent of the
policy. Thus, we minimize $\bar{C}$, $\bar{W}$, and $\bar{L}$ by minimizing $\bar{F}$, and the four
measures are equivalent. The same can be said about the totals C, F, W, and L. We should note that
there is no parallel result concerning $C_{max}$, $F_{max}$, $L_{max}$, and $W_{max}$. Of course,
there are special cases where two of these measures are equivalent (for instance,
$r_j = 0 \Rightarrow C_{max} \equiv F_{max}$, and $d_j = D \Rightarrow C_{max} \equiv L_{max}$), but
in general they are not.
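
The equivalence of the mean measures can be checked numerically. The sketch below is a hypothetical single-machine example (the job data, the function name, and the rule that a job cannot start before its ready time are all assumptions for illustration). It enumerates every job order and confirms that mean completion time, mean flow time and mean lateness differ only by constants, so the same order is optimal for all three.

```python
from fractions import Fraction
from itertools import permutations

def mean_measures(order, p, r, d):
    """Mean completion time, flow time and lateness for one job order."""
    t, completion = 0, {}
    for j in order:
        t = max(t, r[j]) + p[j]          # a job cannot start before its ready time
        completion[j] = t
    n = len(order)
    c_bar = Fraction(sum(completion.values()), n)
    f_bar = Fraction(sum(completion[j] - r[j] for j in order), n)   # F_j = C_j - r_j
    l_bar = Fraction(sum(completion[j] - d[j] for j in order), n)   # L_j = C_j - d_j
    return c_bar, f_bar, l_bar

# Hypothetical processing times, ready times and due dates.
p = {1: 4, 2: 1, 3: 3}
r = {1: 0, 2: 2, 3: 1}
d = {1: 6, 2: 5, 3: 9}

table = {order: mean_measures(order, p, r, d) for order in permutations(p)}
best_by_c = min(table, key=lambda o: table[o][0])
best_by_f = min(table, key=lambda o: table[o][1])
best_by_l = min(table, key=lambda o: table[o][2])
r_bar = Fraction(sum(r.values()), len(r))
d_bar = Fraction(sum(d.values()), len(d))
print(best_by_c, best_by_f, best_by_l)
```

Exact rational arithmetic is used so that the constant offsets can be compared without rounding error.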

If we minimize the final completion time of all the jobs, $C_{max}$, then the average number of
processors being used at any one time is maximized and the average idle time of a processor is
minimized, i.e. $C_{max}$, $\bar{N}_p$, $\bar{I}$ are equivalent measures.

We  conclude  from  the  above  discussion  that  the  following  criteria  serve  to  represent  all  regular 
measures  introduced  so  far: 


$\bar{C}$, $C_{max}$, $\bar{C}_w$, $L_{max}$, $\bar{T}$, $\bar{T}_w$

By a regular measure we mean a value that can be expressed as a nondecreasing function of the
completion times of the jobs, i.e. a regular measure R is a real function of $C_1, C_2, \ldots, C_n$:

$R = f(C_1, C_2, \ldots, C_n)$ such that $C_1 \le C_1', \ldots, C_n \le C_n'$ implies that
$R \le R' = f(C_1', C_2', \ldots, C_n')$.

$C_{max}$, $\bar{C}$, $\bar{F}$, $F_{max}$, $\bar{L}$, $L_{max}$, $\bar{T}$, $T_{max}$, $U$ are
regular measures. However, $\bar{E}$ and $E_{max}$ are not regular measures.
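
A quick numerical check makes the distinction tangible. The sketch assumes the usual definitions of tardiness, max(0, C - d), and earliness, max(0, d - C), for a single job with a hypothetical due date; the helper names are illustrative. Tardiness never decreases as the completion time grows, whereas earliness does, which is why earliness-based measures are not regular.

```python
def tardiness(c, d):
    return max(0, c - d)

def earliness(c, d):
    return max(0, d - c)

def is_nondecreasing(measure, due, completions):
    """True if the measure never decreases as the completion time increases."""
    values = [measure(c, due) for c in sorted(completions)]
    return all(a <= b for a, b in zip(values, values[1:]))

due_date = 5                                   # hypothetical due date
tardiness_regular = is_nondecreasing(tardiness, due_date, range(0, 11))
earliness_regular = is_nondecreasing(earliness, due_date, range(0, 11))
print(tardiness_regular, earliness_regular)
```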


III. CONCLUSION

In  this  report  we  have  presented  a  summarized  discussion  of  a  number  of  significant  factors  that 
will  have  to  be  considered  in  any  study  involving  performance  evaluation  of  distributed  computing 
systems.  We  also  have  identified  some  of  the  major  problems  which  influence  the  performance  of 
such  systems.  These  factors  and  problems  are  introduced  under  six  fundamental  areas.  These  areas 
are:  system’s  architecture  and  environment,  application  used  as  defined  in  terms  of  workload 
specification  and  characterization,  types  of  constraints  that  exist  in  the  system’s  hardware  or  software, 
types  of  service  policies  that  may  be  used  at  a  computing  facility,  some  typical  performance  metrics, 
and  finally,  optimality  criteria  and  the  relations  between  some  of  them. 

Clearly,  much  more  research  is  needed  to  make  use  of  the  ideas  introduced  here.  In  any  field,  the 
identification  and  systematic  classification  of  key  ideas  and  issues  are  essential  tasks  for  progress. 
Based  on  the  issues  and  problems  presented  in  this  report,  it  is  hoped  that  rigorous  taxonomies  can  be 
developed  to  help  determine  standard  terminology  for  performance  evaluation  of  distributed 
computing systems. This rigorous taxonomy should be both exhaustive of the categories and
exclusive, with no overlap of the classification criteria defined. This, in turn, will provide an
unambiguous categorization for every concept presented to the taxonomy.

Table 1 Some Relevant Optimality Parameters (surviving entries: maximum tardiness; mean
lateness $\bar{L} = \frac{1}{n} \sum_{j=1}^{n} L_j$; weighted number of tardy jobs; average number
of jobs still to be completed by time t over the time period)

THE EFFECTS OF ARRAY BANDWIDTH ON PULSE RADAR

ABSTRACT

The  basic  principles  governing  phase  scanning  of  array  antennas  are 
briefly  outlined.  Examination  of  these  principles  shows  that  a  frequency 
dependence  in  the  phase  steering  equation  leads  to  an  inability  of  a  phase 
scanned  array  to  radiate  all  frequencies  in  a  single  direction.  Using  this 
fact  the  concept  of  array  bandwidth  is  introduced  and  an  appropriate 
definition  given.  By  adopting  a  linear  system  representation  for  the  antenna, 
it  is  illustrated  how  the  resulting  loss  of  signal  energy  at  the  receiver  can 
be  conveniently  calculated.  Having  defined  the  problem  and  its  major  effect 
on  radar  performance,  the  compensation  technique  of  time-delay  subarraying  is 
discussed.  Special  consideration  is  given  to  systems  employing  linear 
frequency modulation pulse compression. Plots of the number of subarrays
required to maintain a certain level of radar performance vs maximum scan
angle are given for various system parameters.


THE EFFECTS OF ARRAY BANDWIDTH ON PULSE RADAR
PERFORMANCE AND TIME-DELAYED SUBARRAY COMPENSATION

Charles  T.  Widener 

I.  Introduction 

It is well known in the theory of array antennas that the main beam of
the antenna can be steered in space (scanned) by the application of a linear
phase progression across the antenna aperture. At a frequency f and
wavelength $\lambda$, the phase required on an element a distance x from the
antenna center to steer the beam to an angle $\theta$ from broadside is

$$\phi(x,\theta) = \frac{2\pi}{\lambda} x \sin\theta = \frac{2\pi f}{c} x \sin\theta \qquad (1)$$

where c is the velocity of propagation. For elements arranged in a
line, with uniform separation d, the phase difference from one element to
another is given by

$$\Delta\phi = \frac{2\pi}{\lambda} d \sin\theta = \frac{2\pi f}{c} d \sin\theta \qquad (2)$$

The angle $\theta$ to which the beam points is changed by adjusting the value of
$\phi$ on each element. If the applied phase $\phi$ on each element is allowed only
values between 0 and $2\pi$, then the array is said to be phase scanned. Devices
used to provide this kind of phasing are called phase shifters. Phase scanned
arrays are frequency sensitive, as may be seen from an inspection of eq. (1)
or (2) as follows. If the phase $\phi$ on each element is held constant and has
been adjusted to provide a scan angle $\theta$ for a given operating frequency f,
then since $\phi$ is constant so is the quantity $f \sin\theta$. If $f \sin\theta$ equals a
constant quantity, then any change in f causes a change in $\sin\theta$ such that
their product remains unchanged. The resulting effect is that for phase
scanned arrays, a change in frequency causes a shift in scan angle.

Time-delaying  is  another  method  that  can  be  used  to  scan  an  array. 
Time-delay  scanning  uses  delay  lines  instead  of  phase  shifters  to  provide  the 
necessary phasing on each element. A delay line in its simplest form is just
a piece of transmission line. The length of the delay line is adjusted to
provide a time delay of $t = (d/c)\sin\theta$ (see Fig. 1) from element to element.
With  time-delay  scanning  the  scan  angle  is  independent  of  frequency. 
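
The contrast between the two scanning methods can be sketched numerically. The code below is an illustration under stated assumptions, not part of the report: the function names, the 1 GHz design frequency, the 45 degree scan angle and the 0.15 m element spacing are all hypothetical. For the phase-scanned case the element phases are fixed at the design frequency, so the product of frequency and sine of the scan angle stays constant; for the time-delay case the inter-element delay is fixed, so the phase grows with frequency and the pointing angle does not change.

```python
import math

C = 299_792_458.0  # propagation velocity in m/s (free space assumed)

def phase_scanned_angle(f, f0, theta0_deg):
    """Beam direction at frequency f when phases were set for theta0 at f0."""
    s = (f0 / f) * math.sin(math.radians(theta0_deg))   # f * sin(theta) is constant
    return math.degrees(math.asin(s))

def time_delay_scanned_angle(f, d, theta0_deg):
    """Beam direction at frequency f when a fixed delay (d/c) sin(theta0) is used."""
    delay = (d / C) * math.sin(math.radians(theta0_deg))
    phase_step = 2 * math.pi * f * delay                 # phase grows with frequency
    s = phase_step * C / (2 * math.pi * f * d)           # frequency cancels here
    return math.degrees(math.asin(s))

squinted = phase_scanned_angle(1.05e9, 1.0e9, 45.0)     # 5 percent above design frequency
steady = time_delay_scanned_angle(1.05e9, 0.15, 45.0)
print(squinted, steady)
```

With these numbers the phase-scanned beam moves a few degrees toward broadside, while the time-delay beam stays at the design angle.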


Fig.  1  Phase  relationships  for  Beam  Steering 

A significant distinction between phase shift and time delay
scanning can be seen by noting that phase shifters provide a
total phase shift of $0 \le \phi < 2\pi$ and delay lines provide a phase
shift of

$$\phi = 2\pi \frac{l}{\lambda} \qquad (3)$$

for a transmission line of length l. If l is many times $\lambda$, then the resultant
phase is many multiples of $2\pi$. This will always be the case for antennas of
practical size and beamwidth.

The phase relationships given by eqs. (1) and (2) are derived by
assuming each radiating element of the array operates in a single frequency CW
mode. Most radars, however, operate in a pulsed mode rather than CW. The
frequency content of a pulsed waveform consists of a spectrum which extends
over a frequency band $\Delta f_p$ (the subscript denotes pulse) centered around the
transmitted frequency $f_0$. When an array antenna is phase scanned to an angle
$\theta_0$, corresponding to the frequency $f_0$, the frequency dependence of $\phi$ will
cause frequencies other than $f_0$ to scan to angles other than $\theta_0$. The amount
of angular deviation $\Delta\theta$ from $\theta_0$ for a frequency of $f_0 + \Delta f$ is given by

$$\Delta\theta = -\frac{\Delta f}{f_0} \tan\theta_0 \qquad (4)$$

This is known as the aperture effect. A similar phenomenon occurs when
each element of the array is fed by a transmission line of differing length,
such as in series fed arrays. In this case, the phase of excitation at each
element will vary according to the length of the line and frequency used
(exactly like time-delay scanning). The added phase variation causes a
further deviation in scan angle. This is known as a feed effect. The two
effects are additive and can be understood independently of one another. For
the remainder of this report it will be assumed that all radiating elements
are fed in parallel by equal line lengths (also known as corporate fed) so
that feed effects can be ignored.

The definition of array bandwidth for pulsed waveforms, as proposed by
Frank [1], is that range of frequencies, $\Delta f_a$ (the subscript a denoting array),
which causes the angular deviation $\Delta\theta$ to lie within the half power beamwidth
$\theta_{3dB}$ defined for CW operation on the same array. This report considers the
limitations on the ability of a phase scanned array to transmit (and receive)
a pulse spectrum due to the array bandwidth effect and the resulting effects
on radar performance.
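
The angular deviation of eq. (4) and the resulting array bandwidth can be estimated with a short sketch. Several things here are assumptions for illustration: the function names, the sample numbers (3 GHz carrier, 60 degree scan, 1 degree beamwidth), and in particular the reading of the bandwidth definition as "the deviation at each band edge equals half the beamwidth", which is only one plausible interpretation. The linearized deviation is compared with the exact value obtained from holding the product of frequency and sine of the scan angle constant.

```python
import math

def squint_linear(df, f0, theta0_deg):
    """Eq. (4): delta_theta = -(df / f0) * tan(theta0), returned in degrees."""
    return math.degrees(-(df / f0) * math.tan(math.radians(theta0_deg)))

def squint_exact(df, f0, theta0_deg):
    """Exact deviation from f * sin(theta) = constant, returned in degrees."""
    s = f0 / (f0 + df) * math.sin(math.radians(theta0_deg))
    return math.degrees(math.asin(s)) - theta0_deg

def array_bandwidth(f0, theta0_deg, beamwidth_deg):
    """Total bandwidth for which the band-edge deviation is half a beamwidth
    (assumed reading of the definition, using the linearized eq. (4))."""
    return f0 * math.radians(beamwidth_deg) / math.tan(math.radians(theta0_deg))

lin = squint_linear(30e6, 3e9, 60.0)       # 1 percent frequency offset
exact = squint_exact(30e6, 3e9, 60.0)
bw = array_bandwidth(3e9, 60.0, 1.0)
print(lin, exact, bw)
```

For small fractional offsets the linear form tracks the exact result closely; the gap widens as the offset or the scan angle grows.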

II.  Array  Bandwidth  Limitations 

One obvious consequence of the angular dispersion described by equation
(4) is that some of the energy intended to be radiated in a direction $\theta$ will go
elsewhere. The result is a loss of energy on target. This loss is sometimes
referred to as one-way loss since only the transmit portion of the round trip
is considered.

The loss of energy on target can be calculated by considering the
scanned array as a frequency dependent network, which has an additional
dependence on $\theta$. The input to the array is the transmitted pulse, denoted
s(t) in the time domain. The output of the array $y(t,\theta)$ will be the
convolution of the input signal with the impulse response of the array
$a(t,\theta)$ given by the expression:

$$y(t,\theta) = \int s(\tau)\, a(t-\tau,\theta)\, d\tau \qquad (5)$$

Since convolution in the time domain implies multiplication in the
frequency domain, the output spectrum $Y(\omega,\theta)$ takes the form

$$Y(\omega,\theta) = S(\omega) A(\omega,\theta) \qquad (6)$$

The energy of the output waveform at a scan angle $\theta$ over the array
bandwidth $\Delta f_a$ defined above is given by the expression

$$E(\theta) = \frac{1}{2\pi} \int |S(\omega) A(\omega,\theta)|^2 \, d\omega \qquad (7)$$

The loss in energy due to array bandwidth limitations can be expressed
as the ratio of the energy on target at a scan angle $\theta$ to the energy on target
for a scan angle $\theta = 0^\circ$,

$$\text{Loss ratio} = \frac{\int |S(\omega) A(\omega,\theta)|^2 \, d\omega}{\int |S(\omega)|^2 \, d\omega} \qquad (8)$$

This loss has been computed by several authors, notably Frank [1], Hammer [2],
and Adams [3].
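
A rough numerical version of the loss ratio can be sketched under simplifying assumptions that are not taken from the cited papers: the pulse spectrum is flat over a band equal to the reciprocal of the pulse length, the aperture is continuous and uniformly illuminated, and the array response in the steered direction at a frequency offset df is modelled as sin(pi df T) / (pi df T), where T is the fill time (L/c) times the sine of the scan angle. The function names and the midpoint-rule integration are likewise illustrative choices.

```python
import math

def sinc(x):
    """Normalised sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def one_way_loss_ratio(fill_time_over_pulse, steps=2000):
    """Ratio of eq. (8) for a flat spectrum and a sinc-shaped array response.

    fill_time_over_pulse is T divided by the pulse length (assumed inputs).
    """
    # Normalised frequency u = df / B runs over [-1/2, 1/2]; df * T = u * (T / pulse length).
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = -0.5 + (k + 0.5) * h                       # midpoint rule
        total += sinc(u * fill_time_over_pulse) ** 2 * h
    return total

def one_way_loss_db(fill_time_over_pulse):
    return -10.0 * math.log10(one_way_loss_ratio(fill_time_over_pulse))

print(one_way_loss_ratio(1.0), one_way_loss_db(1.0))
```

Under this model the loss vanishes at broadside, where the fill time is zero, and grows steadily as the fill time approaches and exceeds the pulse length.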

An equivalent way of understanding the loss mechanism for broadband
phased-arrays is in terms of the aperture fill time. As shown in Fig. 2, if a
signal is incident from $\theta_0$, the pulse front reaches one end of the array
before the other. The signal travels a distance $L \sin\theta_0$ farther to the last
element than to the first. The time it takes for the signal to be present in
all elements of the array is the aperture fill time $T = (L/c)\sin\theta_0$. It is
readily shown that the array bandwidth can be expressed in terms of the
aperture fill time as $\Delta f_a = 1/T$.
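
As a worked example of these two relations, the sketch below evaluates the aperture fill time and the corresponding array bandwidth. The 10 m aperture and 30 degree scan angle are hypothetical values, the function names are illustrative, and free-space propagation is assumed.

```python
import math

C = 299_792_458.0  # propagation velocity in m/s (free space assumed)

def aperture_fill_time(length_m, theta0_deg):
    """T = (L / c) * sin(theta0), in seconds."""
    return (length_m / C) * math.sin(math.radians(theta0_deg))

def array_bandwidth_from_fill_time(length_m, theta0_deg):
    """Array bandwidth taken as the reciprocal of the fill time, in Hz."""
    return 1.0 / aperture_fill_time(length_m, theta0_deg)

fill_time = aperture_fill_time(10.0, 30.0)
bandwidth = array_bandwidth_from_fill_time(10.0, 30.0)
print(fill_time, bandwidth)
```

For these values the fill time is roughly 17 ns and the array bandwidth roughly 60 MHz, so a pulse much shorter than a few tens of nanoseconds would be noticeably affected.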

It should be clear that if the array bandwidth is larger than the
bandwidth of the pulse waveform $\Delta f_p$ at a particular scan angle, then the
energy loss will be small. As the array bandwidth narrows due to scanning,
more of the pulse energy is lost. Since pulse length $\tau$ is related to the
pulse bandwidth by the relation $\tau \approx 1/\Delta f_p$, the ratio $T/\tau$ can be used as a
reference for energy loss. Figure 3 shows a plot of loss in gain due to one-way
loss of energy as a function of the ratio $T/\tau$ for a rectangular pulse
spectrum, assuming a uniformly illuminated aperture (after Frank).

Fig. 2 Aperture Fill Time

Fig. 3 Loss of Gain due to Scanning (after Frank)

III.  System  Degradation 

While the loss of energy on target for a phase scanned array can account
partially for radar performance degradation, it is more meaningful to consider
the entire signal round trip, i.e., both transmit and receive. In the
transmit mode, each array element is excited at the same time (not true for a
time-delay scanned array). If a target P is located at some angle $\theta_0$ off
broadside, then the signals from each element will arrive at the target
slightly displaced in time by

$$t_d = \frac{d}{c} \sin\theta_0 \qquad (9)$$

Each elemental signal will be reflected and returned to the array and
received by all the elements, suffering the same time delay effect on receive
as on transmit. Two effects are immediately clear:

1) the received pulse is smeared in time (spreading occurs) and

2) the amplitude of the received signal is distorted compared to the
transmitted signal.

The result of these effects is a loss of signal energy at the receiver.
This loss is sometimes called a two way loss because it includes losses
incurred in the signal round-trip. Calculation of the two way loss is again
treated conveniently in the frequency domain as a problem in linear systems.
The spectrum of the transmitter output pulse is first modified (or
filtered) by the transmission transfer function of the antenna array. The
resulting transmitted spectrum is then reflected off a target and, upon
reception, is modified by the receive transfer function of the antenna array.

*Note: The transmit and receive patterns of the array may employ
different aperture weightings to maximize certain features of each. For
example, uniform weighting on transmit allows maximum energy on target,
while a tapered weighting on receive may be used to maintain sidelobes
at some predetermined low level.

The resulting spectrum incident upon the receiver would be

$$Y(\omega,\theta_0) = S(\omega) A_T(\omega,\theta_0) A_R(\omega,\theta_0) \qquad (10)$$

where $A_T$ and $A_R$ are respectively the transmit and receive transfer
functions of the antenna array. An appropriate matched filter would have as
its transfer function $[S(\omega) A_T(\omega,\theta_0) A_R(\omega,\theta_0)]^*$, where * denotes complex
conjugate. This particular implementation is dependent on scan angle $\theta$ and
would be difficult to achieve in practice. A more likely, simpler receiver
would use a filter matched to $S(\omega)$ alone. Signal-to-noise ratio (SNR) loss
curves for both cases (receiver matched to $S(\omega)$, and matched to $Y(\omega,\theta_0)^*$)
have been presented in an excellent paper by J. R. Sklar [4] for a uniformly
weighted aperture, and by Rothenberg and Schwartzman [5] for two Taylor
weighted apertures. A plot of SNR loss for the uniformly weighted case is
shown in Figure 4 (after Sklar). A simple closed form expression describing
the SNR loss due to array bandwidth limitations is given by Barton [6] as

——  (11) 

b; 

where: Lj is the loss factor,
Ba is the half-power bandwidth of the array,
and B3 is the width of the receiver output filter (assumed narrowband).


Fig. 4 SNR loss vs. $T/\tau$ (after Sklar)

Additional  performance  degradations,  also  discussed  by  Sklar,  are  loss 
of  range  resolution  and  loss  of  range  accuracy.  These  degradations  are  mainly 
the  result  of  received  signal  spreading  in  time. 

IV.  Time-Delayed  Subarrays 

As component technology and radar theory become more and more advanced,
the trend has been to build radars with higher resolution capabilities in both
range and azimuth (cross-range). Requirements for fine range resolution
include the use of a short pulse (wide bandwidth); similarly, as more
resolution in azimuth is required, it is necessary to use narrow beamwidths,
which implies array dimensions much greater than the wavelength ($L \gg \lambda$).
Unfortunately, array bandwidth effects become more pronounced when a short
pulse is used with a large array.

One  method  of  overcoming  performance  degradation  due  to  array  bandwidth 
effects  is  through  the  use  of  time-delayed  subarrays.  Time-delaying,  already 
mentioned  as  one  form  of  scanning,  is  currently  prohibitively  costly  and 
difficult  to  implement  on  a  per  element  basis.  But  it  can  be  applied  to  small 
portions  of  the  overall  antenna  (called  subarrays)  as  an  effective  way  to 
increase  array  bandwidth,  and  thereby  decrease  the  SNR  loss. 

An  easy  way  to  understand  the  increase  in  bandwidth  using  time-delayed 
subarrays  is  to  note  that  signal  returns  for  each  subarray  are  in  time 
coincidence,  the  subarrays  having  been  essentially  "stacked"  in  time. 
Computation of array bandwidth is then done using the length of a single
subarray section, $L_s$. Since the bandwidth of a subarray is proportional to
$\lambda/L_s$, the shorter the length of the subarray, the wider the bandwidth
becomes. Comparison of subarray bandwidth to array bandwidth indicates a
factor  of  N  increase  when  N  subarrays  are  used.  This  provides  an  excellent 
means  of  decreasing  the  SNR  degradation  that  accompanies  phase  scanned  arrays 
when  used  with  wideband  pulses. 
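
The factor-of-N argument suggests a simple sizing rule, sketched below under explicit assumptions. The function name and the parameter `max_fill_ratio` are hypothetical: it stands for the largest subarray fill time, expressed as a fraction of the pulsewidth, that a designer is willing to tolerate, and the value 0.5 used in the example is a placeholder rather than a figure from the report. The sketch divides the aperture into N equal subarrays, so each has a fill time of T/N, and returns the smallest N that meets the limit.

```python
import math

C = 299_792_458.0  # propagation velocity in m/s (free space assumed)

def min_subarrays(length_m, theta_max_deg, pulsewidth_s, max_fill_ratio):
    """Smallest N such that the subarray fill time (T / N) does not exceed
    max_fill_ratio * pulsewidth at the maximum scan angle."""
    fill_time = (length_m / C) * math.sin(math.radians(theta_max_deg))  # full aperture
    return max(1, math.ceil(fill_time / (max_fill_ratio * pulsewidth_s)))

# Hypothetical 30 m aperture scanned to 60 degrees.
n_short_pulse = min_subarrays(30.0, 60.0, 10e-9, 0.5)   # 10 ns pulse
n_long_pulse = min_subarrays(30.0, 60.0, 1e-6, 0.5)     # 1 microsecond pulse
print(n_short_pulse, n_long_pulse)
```

The long pulse needs no subdivision at all, while the short pulse calls for many subarrays, which matches the observation that bandwidth effects are most severe for short pulses on large arrays.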

When  it  is  necessary  to  employ  subarraying  to  avoid  excessive  SNR  loss, 
the  primary  factors  for  consideration  are: 

a) pulsewidth $\tau$

b) overall antenna length L

c) maximum scan angle $\theta_0$

d) maximum SNR degradation which can be tolerated, and

e) antenna aperture weighting taper(s) employed.

The pulsewidth $\tau$ and antenna length L can be conveniently expressed together
as the ratio $\tau/T_0$, where $T_0$ is the transit time of L at the speed of light,
i.e. $T_0 = L/c$.

It  is  clear  from  Fig.  4  (for  the  case  of  uniform  weightings)  that  when 
T/t  is  greater  than  3,  the  resulting  SNR  loss  is  less  than  1  dB.  Using  1  dB 
as  the  maximum  tolerable  loss  in  SNR,  the  minimum  number  of  subarrays  required 
is  graphically  represented  in  Figure  5  as  a  function  of  maximum  scan  angle. 


Fig. 5 Minimum Number of Subarrays Required for 1 dB or less SNR loss vs Max Scan Angle
for $\tau/T_0$ = 1.0, 0.5, 0.2, 0.1, 0.05 (K = ratio of pulsewidth to transit time)

The curves are plotted for various ratios of $\tau/T_0$ when the receiver is
matched to the antenna input pulse spectrum $S(\omega)$. If the tolerable SNR loss
were more (or less) than 1 dB, similar charts could be made to conform to that
criterion.

V.  Pulse  Compression  Considerations 

The effects of array bandwidth limitations on radar performance have
been presented for the special case of single frequency, pulse mode operation.
While pulsed radars are typical in modern day use, modes other than single
frequency are very common as well. It was pointed out in section IV that the
trend today is towards higher resolution capability. In this context it was
mentioned that the use of a short pulse is one method used to achieve this
goal. However, as is usually the case, pushing one parameter to its limit
generally reflects adversely in another parameter. Short pulse modes are no
exception. While short pulse modes can increase the resolution in range, they
inherently put less energy on target, which decreases the maximum detection
range. Pulse compression is a technique that employs a modulated transmitter
waveform that maintains a high level of energy on target while achieving the
high range resolution associated with a short pulse. Two transmitter
modulations commonly used are linear frequency modulation (LFM) and phase
coding. Both of these modulations will interact with the scanning mechanisms
of phase scanned arrays. An analysis by J. B. Payne [7] indicates that a phase
scanned antenna's response to a pulse compression waveform is identical to its
response to a pulse equal in width to the compressed pulsewidth. One might
assume that the curves presented in Figure 5 could be used equally well for an
LFM waveform with τ replaced by 1/(f2 - f1), where f1 and f2 are the
limits of the frequency modulation of the LFM waveform. Unfortunately this is
too coarse an assumption.
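
For orientation, the quantities involved in this substitution are easy to
work out numerically, even though reading Figure 5 with them is too coarse.
The sketch below assumes the usual approximation that an LFM pulse swept from
f1 to f2 compresses to a width of about 1/(f2 - f1) and that range resolution
is cτ/2. The example numbers (a 10 microsecond pulse swept over 100 MHz) are
made up for illustration.

```python
C = 299_792_458.0  # speed of light, m/s

def compressed_pulsewidth(f1_hz, f2_hz):
    """Approximate compressed pulsewidth of an LFM sweep, 1 / (f2 - f1)."""
    return 1.0 / abs(f2_hz - f1_hz)

def range_resolution(pulsewidth_s):
    """Range resolution c * tau / 2 for an (effective) pulsewidth tau."""
    return C * pulsewidth_s / 2.0

tau = 10e-6              # transmitted (uncompressed) pulsewidth, s (hypothetical)
f1, f2 = 3.00e9, 3.10e9  # hypothetical sweep limits, Hz
tau_c = compressed_pulsewidth(f1, f2)

print("compressed pulsewidth (s):", tau_c)
print("compression ratio:", tau / tau_c)
print("range resolution, uncompressed (m):", range_resolution(tau))
print("range resolution, compressed (m):", range_resolution(tau_c))
```

The long transmitted pulse sets the energy on target, while the much shorter
compressed pulsewidth sets the range resolution.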

Regarding Payne's analysis with respect to LFM pulse compression, it
needs to be mentioned that his analysis assumes a matched filter receiver. It
is well known [8-9] that the matched filter output to an LFM pulse has
undesirable time sidelobes which can mask a weak target or be mistaken for
targets themselves. These sidelobes are reduced by amplitude weighting the
received signal spectrum within the filter, much as an antenna aperture is
weighted to reduce sidelobes. The resulting loss due to filter mismatch is on
the order of 1 to 2 dB for most practical weightings.
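
The size of this mismatch loss can be estimated from the weights alone. The
sketch below uses the standard expression for the SNR efficiency of an
amplitude-weighted filter acting on a flat spectrum, (Σw)² / (N Σw²),
relative to uniform (matched) weighting, and applies it to a Hamming weighting
as a representative example. The choice of Hamming and of 256 samples is an
assumption for illustration, not a weighting named in the report.

```python
import math

def hamming(n):
    """Symmetric Hamming weights, 0.54 - 0.46 cos(2 pi i / (n - 1))."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def mismatch_loss_db(weights):
    """SNR loss (positive dB) of weighted vs. uniform filtering of a flat spectrum."""
    n = len(weights)
    efficiency = sum(weights) ** 2 / (n * sum(w * w for w in weights))
    return 10.0 * math.log10(1.0 / efficiency)

print("uniform:", mismatch_loss_db([1.0] * 256))
print("Hamming:", round(mismatch_loss_db(hamming(256)), 2))
```

For the Hamming example the computed loss falls inside the 1 to 2 dB range
quoted above, while uniform weighting gives no loss.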

Knittel [10], however, has done a comprehensive study on the SNR loss
and range resolution degradation resulting from array dispersion for systems
using LFM pulse compression. His results are based on a comparison to an
ideal system having a dispersionless array and a receiver weighting filter
designed for 40 dB time sidelobes (Taylor n=8). His analysis shows that there
is an optimum pulse compression bandwidth which minimizes both SNR loss and
pulse shape distortion. Although as an example he has examined a system with
specific parameters, he has generalized his results to include arbitrary
aperture size D/λ, arbitrary scan angle θ, and arbitrary fractional signal
bandwidth B = (f2 - f1)/f0, where f0 is the midband frequency of the pulse
compression modulation. It should also be mentioned that Knittel assumes a
larger usable bandwidth for the array than that proposed by Frank.

Abstracting from Knittel's SNR loss curve for a parallel feed array with
uniform weighting on transmit and Taylor 30 dB (n=6) weighting on receive, it
is found that time-delay subarraying should be used if the ratio of compressed
pulsewidth to aperture fill time τc/T is less than 0.88, in order to maintain
an SNR loss of less than 1 dB. Based on this result, the number of subarrays
required to maintain less than 1 dB SNR degradation vs maximum scan angle for
various values of K = τc/T0 is shown in Figure 6.


Fig. 6  Minimum Number of Subarrays Required for 1 dB or less SNR loss
vs Max Scan Angle for τc/T0 = 0.5, 0.2, 0.1, 0.05
(K = ratio of compressed pulsewidth to transit time)
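
The dependence shown in Fig. 6 can be sketched with a short derivation. It
assumes, as an illustration rather than a statement of how the figure was
produced, that the aperture fill time at the maximum scan angle is
T = T0 sin θmax and that N equal contiguous time-delayed subarrays each have
fill time T/N, so that the 0.88 criterion applies to each subarray:

```latex
\frac{\tau_c}{T/N} \ge 0.88, \qquad
T = T_0 \sin\theta_{\max}, \qquad
K = \frac{\tau_c}{T_0}
\quad\Longrightarrow\quad
N \ge \frac{0.88\,\sin\theta_{\max}}{K}
```

For example, under these assumptions K = 0.1 and a maximum scan angle of
60 degrees would give N >= 0.88 x 0.866 / 0.1, or about 7.6, i.e. 8 subarrays.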


VI.  Summary 

In the preceding pages it has been shown that when the elements of an
antenna array are controlled by phase shift devices to steer the antenna beam
in space, the direction of the energy is dependent on the frequency of the
transmitted RF signal. For the case of pulsed radars it was seen that, due to
the inherent nature of a pulsed waveform, some of the transmitted energy is
radiated in directions other than the intended scan angle. This leads to a
definition of a bandwidth for the array which in essence describes a band of
frequencies which will radiate in the intended direction. In sections II and
III the resulting effects on radar performance due to array bandwidth
limitations were discussed. The first effect noted was a loss of energy on
target due to angular dispersion. The second, and more important effect, was
the loss of signal energy at the receiver resulting in SNR degradation.
Curves for both the one- and two-way loss were presented to illustrate their
dependence on pulsewidth τ, antenna length L, and scan angle θ.

In Section IV, a compensation technique using time-delayed subarrays was
discussed and plots given for the number of subarrays required to maintain
less than 1 dB SNR degradation for various ratios of τ/T0 as a function of
maximum scan angle. Similar plots were presented in Section V for
applications using LFM pulse compression.

As  a  final  note,  it  is  mentioned  that  the  subarraying  technique  referred 
to  in  this  report  is  the  conventional  or  contiguous  method.  Other  methods 
such  as  overlapped  and  interleaved  can  also  be  used  to  further  reduce  SNR  loss 
and  minimize  grating  lobe  levels.  Interested  readers  are  referred  to  an 
overview  of  such  techniques  by  R.  Tang  [11]. 

Abstract

Tactical  simulation  models  are  often  used  to  assess  vulnerabilities  and  capabilities 
of  combat  systems  and  doctrines.  Due  to  the  complexity  of  tactical  simulation  models, 
it  is  often  difficult  to  assess  the  relationship  between  input  factors  and  the  performance 
of  the  simulation  model.  To  facilitate  this  type  of  assessment,  simulation  analysts 
often use the simulation model to empirically construct a black-box approximation
of the causal and time dependent behavior of the simulation model. This type of
approximation is known as a metamodel and can be viewed as a summary of the
behavior of the simulation model. We demonstrate this technique in the context of
an example using TERSM (Tactical Electronic Reconnaissance Simulation Model).
The  results  indicate  that  metamodeling  is  applicable  to  tactical  simulation  models 
and  that  the  technique  has  a  wide  range  of  uses. 